TARGETING GENE TRANSFER VECTORS TO 
CERTAIN CELL TYPES BY PSEUDOTYPING WITH 
VIRAL GLYCOPROTEIN 

FIELD OF THE INVENTION 

The present invention relates generally to compositions and methods for
selective gene transfer, and in particular, to methods for targeting genes to certain cell
types, comprising introducing to a cell population the gene to be transferred
operatively-linked to an appropriate transfer vehicle, wherein the transfer vehicle is
associated with a transmembrane form of viral glycoprotein.

BACKGROUND OF THE INVENTION

Ebola virus has been identified as the cause of several highly lethal outbreaks
of hemorrhagic fever. Infection typically begins with flu-like symptoms which often
progress rapidly to fatal complications of hemorrhage, fever, and hypotensive shock.
Bowen, E.T.W. et al., Lancet 1:571 (1977); Centers for Disease Control, M.M.W.R.
44:381 (1995); Le Guenno, B. et al., Lancet 345:1271 (1995); Peters, C.J. et al.,
Fields Virology, B.N. Fields, D.M. Knipe and P.M. Howley, Eds. (Lippincott-Raven,
Philadelphia) p. 1161 (1996). The negative-stranded genome of Ebola virus encodes
seven structural and regulatory proteins (Sanchez, A. et al., Virus Res. 29:215
(1993)), but despite its relative simplicity, the molecular basis for Ebola virus
pathogenicity is unknown. Among the viral gene products, the glycoprotein is found
in two forms: a secreted form (sGP), 50-70 kD (Sanchez, A. et al., PNAS (USA) 93:3602
(1996)), synthesized at high levels early in the course of infection, and an alternative
transmembrane form (GP), which arises from RNA editing to encode a 120-150 kD
glycoprotein that is incorporated into the virion. Sanchez, A. et al., PNAS (USA)
93:3602 (1996); Volchkov, V.E. et al., Virology 214:421 (1995). The first 295 amino
acids (aa) of both proteins are identical in the Zaire strain, while sGP contains an
additional 69 and GP another 381 COOH-terminal aa residues. Sanchez, A. et al.,
PNAS (USA) 93:3602 (1996). The specific cellular targets of these related gene
products and their roles in the pathogenesis of Ebola infection have not been
characterized.
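
As a quick check of the residue counts quoted above for the Zaire strain, the total lengths of the two glycoprotein forms can be tallied in a few lines. This is only an illustrative sketch; the variable names are hypothetical and the totals simply add the shared and form-specific counts stated in the text.

```python
# Residue counts as stated in the text for the Zaire strain.
SHARED_N_TERMINAL_AA = 295      # identical in sGP and GP
SGP_UNIQUE_C_TERMINAL_AA = 69   # additional COOH-terminal residues in sGP
GP_UNIQUE_C_TERMINAL_AA = 381   # additional COOH-terminal residues in GP

# Total length of each form is the shared region plus its unique tail.
sgp_length = SHARED_N_TERMINAL_AA + SGP_UNIQUE_C_TERMINAL_AA
gp_length = SHARED_N_TERMINAL_AA + GP_UNIQUE_C_TERMINAL_AA

print(f"sGP: {sgp_length} aa, GP: {gp_length} aa")
```

Under these counts the secreted form comes to 364 residues and the transmembrane form to 676 residues.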

SUMMARY OF THE INVENTION

The present invention provides compositions and methods for targeting gene
transfer vectors to certain cell types by pseudotyping with a transmembrane form of
viral glycoprotein. In one embodiment, the methods of the invention comprise the
step of administering to a cell population a gene to be transferred operatively-linked
to an appropriate transfer vehicle, wherein the transfer vehicle is associated with a
transmembrane form of Ebola glycoprotein. In this embodiment, the gene will be
targeted to cell types naturally infected with Ebola such as endothelial cells,
monocytes and hepatocytes.

Genetic constructs for selective gene transfer into certain cell types are also
provided. The genetic constructs of the present invention comprise a gene to be
transferred operatively-linked to an appropriate transfer vehicle or carrier, wherein the
transfer vehicle or carrier is associated with a transmembrane form of viral
glycoprotein. In one embodiment, the transmembrane form of Ebola glycoprotein is
expressed on the surface of a virus-based gene-targeting vector, e.g., lentiviral or
retroviral vector. In another embodiment, an expressed or synthesized
transmembrane glycoprotein is chemically derivatized to a non-biologic gene targeting
vehicle.

Additional objects, advantages, and features of the present invention will
become apparent from the following description and appended claims, taken in
conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 
The various advantages of the present invention will become apparent to one
skilled in the art by reading the following specification and subjoined claims and by
referencing the following drawings.

Figures 1A-1C show the binding of sGP to neutrophils;
Figures 2A-2D show the infection of different cell types by a GP-pseudotyped
vector of the present invention;
Figures 3A-3F show the dependence of sGP binding on CD16b and correlation
of binding with neutrophil activation;

Figures 4A-4B show the effect of sGP on neutrophil function;
Figures 5A-5C show the infection rate of cells with a GP-pseudotyped retroviral
vector of the present invention;
Figure 6 is a schematic of the plasmid pVR 1012-GP(IC) (Ivory Coast strain
of GP, see SEQ ID NO: 1);

Figure 7 is a schematic of the plasmid pVR 1012-GP(S) (Sudan strain of GP,
see SEQ ID NO: 2);

Figure 8 is a schematic of the plasmid pVR 1012-GP(Z) (Zaire strain of GP,
see SEQ ID NO: 3);

Figure 9 is a schematic of the plasmid pVR 1012-sGP(Z) (Zaire strain of sGP,
see SEQ ID NO: 4); and

Figure 10 is a summary of the characterization of GP and sGP derivatives for
their ability to pseudotype to induce cytotoxicity in producer cells.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention provides genetic constructs and methods for targeting
gene transfer vectors to certain cell types by pseudotyping with a transmembrane
form of viral glycoprotein. The methods for selective gene transfer of the present
invention comprise the step of administering to a cell population a genetic construct
of the present invention so that the gene is transferred and expressed in certain cell
types present in the cell population. Administration to the cell population may be ex
vivo or in vivo.

The genetic constructs of the present invention comprise a gene to be
transferred operatively-linked to an appropriate transfer vehicle or carrier, wherein the
transfer vehicle or carrier is associated with a transmembrane form of viral
glycoprotein. In one embodiment, the transmembrane form of Ebola glycoprotein is
associated with the vehicle or carrier. The gene to be transferred will thus be
targeted to cell types naturally infected with Ebola virus including endothelial cells,
hepatocytes, monocytes and related cell types such as dendritic cells. The
transmembrane form of Ebola glycoprotein may be chosen from, without limitation, the
Ivory Coast strain (SEQ ID NO: 1), the Sudan strain (SEQ ID NO: 2), the Zaire strain
(SEQ ID NO: 3) and/or the Reston strain. It will be appreciated that in other
embodiments of the present invention, other hemorrhagic fever virus glycoproteins,
in particular transmembrane glycoproteins, may be employed and will target those cell
types naturally infected by the virus. Examples of hemorrhagic fever viruses include
dengue virus and Yellow Fever virus (flaviviridae); Lassa, Junin and Machupo
(arenaviridae); Rift Valley, Congo-Crimean and Hantaan (bunyaviridae); and Marburg
(filoviridae). It will also be appreciated that derivatives of the transmembrane
glycoprotein which retain the capability of targeting specific cell types may also be
employed; for example, the transmembrane glycoproteins may be mutated, e.g., toxic
regions may be removed to improve producer cell viability (see Figure 10).

The transmembrane glycoprotein may be expressed on the surface of a
virus-based gene-targeting vector, e.g., lentiviral, retroviral, replication-deficient retroviral,
adenoviral or adeno-associated viral vector. The transmembrane glycoprotein may
also be expressed or synthesized and chemically derivatized to a non-biologic gene
targeting vehicle, e.g., liposome or DNA-protein complex.

The term "operatively-linked" as used herein refers to functional linkage
between a nucleic acid expression control sequence (such as a promoter) and a
second nucleic acid sequence (i.e., gene), wherein the expression control sequence
directs transcription of the nucleic acid corresponding to the second sequence.
Expression control sequences are known to those skilled in the art (see, e.g.,
Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic
Press, San Diego, CA (1990)). "Associated with" as used herein refers to the
transmembrane form of viral glycoprotein being in contact or linkage with the transfer
vehicle or carrier in such a way as to direct the transfer vehicle or carrier to certain
cell types. The terms "transfer vehicle" and "carrier" refer to any type of structure
which is capable of delivering the gene of interest to a target cell.

Many transfer vehicles or carriers are known in the art. For example, various
viruses that are capable of infecting cells can be recombinantly manipulated to carry
the gene of interest without affecting their infectivity. As used herein, the terms
"infect" and "infectivity" refer only to the ability of a virus to transfer genetic material
to a target cell. Those terms do not mean that the virus is capable of replication in
the target cell. In fact, it is preferable that such viruses are replication defective so
that target cells do not suffer the effects of viral replication.

In one embodiment, the virus employed is a replication defective retrovirus.
When replication defective retroviruses are employed, their genomes can be
packaged by a helper virus in accordance with well-known techniques. Suitable
retroviruses include pLJ, pZip, pWe and pEM, each of which is well known in the art.
Suitable helper viruses for packaging genomes include ψCrip, ψCre, ψ2, ψAm and
adeno-associated viruses.

In another embodiment, lentiviral vectors are employed. Surprisingly, the
inventors of the present invention were successful in pseudotyping lentiviral vectors
(HIV) with the transmembrane glycoprotein from Ebola. Feline immunodeficiency
virus, bovine immunodeficiency virus, simian immunodeficiency virus and EIAV may
also be employed as the carrier in the compositions and methods of the present
invention.

Gene delivery systems other than viruses may also be employed. For
example, the gene to be transferred may be packaged in a liposome which is
chemically derivatized to the transmembrane glycoprotein. To form these liposomes,
one mixes the DNA of an expression vector which expresses the gene to be
transferred with lipid, such as N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium
chloride (DOTMA) in a suitable buffer, such as Hepes buffered saline. This causes
the spontaneous formation of lipid-DNA complexes (liposomes). Felgner, P.L. et al.,
PNAS (USA) 84:7413-7417 (1987).

Another gene delivery system that may be utilized in this invention is DNA- 
protein complexes. The formation of DNA-protein complexes is described in United 
States Patent No. 5,166,320, the disclosure of which is herein incorporated by 
reference. 

It will be appreciated that any gene may be employed in the compositions and
methods of the present invention. For example, and without limitation, in the
treatment of cancer, death inducing genes, including genes coding for cytostatic or
cytotoxic proteins, e.g., HSV tk, and genes encoding cyclin dependent kinase
inhibitors, p21, p27, cytosine deaminase, and fas ligand, may all be employed. In
another example, for the treatment of cardiovascular or ischemic vascular disease,
genes encoding angiogenic factors such as VEGF, basic or acidic FGFs (FGF 1-5)
may be employed. In yet another example, in the treatment of vasospasm, the genes
encoding NO synthase or heme oxygenase may be employed. In a further example,
monocytes and dendritic cells may be targeted with genes encoding immunogens for
cell-targeted immunization.

In one embodiment, the methods of targeting gene transfer vectors to certain
cell types involve administering to a cell population ex vivo, a construct of the present
invention and introducing the transfected cells into a subject. In an alternative
embodiment, the methods of the present invention comprise administering to an in
vivo cell population a construct of the present invention. Administration can be by any
of the routes normally used for in vivo gene therapy such as direct delivery to cells
via a gene gun, and other known techniques. The constructs are thus administered
in any suitable manner, preferably with pharmaceutically acceptable carriers. The
constructs can be administered, for example, by intravenous infusion, orally, topically,
intraperitoneally, intravesically or intrathecally. The preferred method of
administration will often be intravenous.

To practice an ex vivo method of the present invention, a source of cells is
obtained. The cells are optionally selected from in vitro cells, such as those derived
from cell culture and ex vivo cells, such as those derived from a subject. The term
"subject" is intended to include living organisms, e.g., mammals. Examples of
subjects include humans, primates, dogs, cats, mice, rats, and transgenic species
thereof. It will be appreciated that specific cell populations may be obtained by
isolation from certain tissues by methods known to those skilled in the art. The cells
are maintained under conditions necessary to support growth, for example an
appropriate temperature (e.g., 37°C) and atmosphere (e.g., air plus 5% CO2).

The cells are then transfected with the constructs of the present invention by
introducing the constructs to the cell population, under conditions favorable for
transfection. According to one embodiment of the present invention, cells are treated
with compounds that facilitate uptake of the constructs by the cells. According to
another embodiment of the present invention, cells are treated with compounds that
stimulate cell division and facilitate uptake of the constructs. It will be appreciated
that compounds that facilitate uptake of constructs by cells and compounds that
stimulate cell division are known to those skilled in the art.

The constructs of the present invention express the transferred gene in a
dose-dependent manner. The specific dose to be administered to a patient will be
determined by the efficacy of the particular construct and/or delivery system
employed, the gene transferred, and the condition of the patient, as well as the body
weight or surface area of the patient to be treated. The size of the dose also will be
determined by the existence, nature, and extent of any adverse side-effects that
accompany the administration of a particular construct or affect a particular patient.
In determining the effective amount of the construct or transfected cell to be
administered, the physician needs to evaluate circulating plasma levels, toxicities, and
progression of disease. It will be appreciated that administration can be accomplished
via single or divided doses.

There is a wide variety of suitable formulations for pharmaceutical
compositions containing the constructs of the present invention. Formulations suitable
for oral administration can consist of (a) liquid solutions, such as an effective amount
of the construct dissolved in diluents, such as water, saline or PEG 400; (b) capsules,
sachets or tablets, each containing a predetermined amount of the construct, as
liquids, solids, granules or gelatin; (c) suspensions in an appropriate liquid; and (d)
suitable emulsions. The construct, alone or in combination with other suitable
components, may also be made into aerosol formulations to be administered via
inhalation, e.g., to the bronchial passageways. Aerosol formulations can be placed
into pressurized acceptable propellants, such as dichlorodifluoromethane, propane,
nitrogen, and the like.

Suitable formulations for rectal administration include, for example,
suppositories, which consist of the construct with a suppository base. Suitable
suppository bases include natural or synthetic triglycerides or paraffin hydrocarbons.
In addition, it is also possible to use gelatin rectal capsules which consist of a
combination of the construct with a base, including, for example, liquid triglycerides,
polyethylene glycols, and paraffin hydrocarbons.

Formulations suitable for parenteral administration, such as, for example, by
intraarticular (in the joints), intravenous, intramuscular, intradermal, intraperitoneal,
and subcutaneous routes, include aqueous and non-aqueous, isotonic sterile injection
solutions, which contain antioxidants, buffers, bacteriostats, and solutes that render
the formulation isotonic with the blood of the intended recipient, and aqueous and
non-aqueous sterile suspensions that can include suspending agents, solubilizers,
thickening agents, stabilizers, and preservatives. The formulations can be presented
in unit-dose or multi-dose sealed containers, such as ampules or vials.
Extemporaneous injection solutions and suspensions can be prepared from sterile
powders, granules, and tablets of the kind previously described. Cells transfected by
the constructs as described above in the context of ex vivo therapy can also be
administered as described above.

This invention also provides compositions and kits comprising the constructs
of the present invention. For example, the composition can comprise the constructs
of the present invention in a pharmaceutically acceptable carrier as described above.
Kits comprising such compositions and instructions for use are also within the scope
of this invention.

In order to more fully demonstrate the advantages arising from the present
invention, the following examples are set forth. It is to be understood that the
following is by way of example only and is not intended as a limitation on the scope
of the invention.

SPECIFIC EXAMPLE 1 

I. Methods 

Recombinant retroviruses were produced by transient transfection of 293T
cells: 2 x 10® cells were plated 24 hours before transfection in 60 mm dishes.
Transfection was performed by calcium-phosphate precipitation using 3 µg of a
retroviral vector (Kinsella, T.M. et al., Hum. Gene Ther. 7:1405 (1996)) encoding
luciferase linked to an internal ribosome entry site and a green fluorescent protein
derivative (GFP; pEGFP, Clontech), pLZRs-Luc-Gfp, 5 µg of an expression vector
encoding gag and pol, pNGVL-MLVgag-pol, and 1 µg of the envelope encoding
plasmid: pNGVL-4070A (ampho) env, pCMV-Eco env or p1012-Ebola GP,
respectively. Supernatants corresponding to 24-48 hours post-transfection were
harvested, cleared by low-speed centrifugation and either used immediately for
infection or frozen at -80°C. Infections were performed in 6-well plates (1.5-2.5 x 10^
adherent cells) or 12-well plates (5x10^ non-adherent) using different dilutions of the
supernatants by incubating the cells overnight with 1 ml and 300 µl, respectively, of
the diluted supernatants. Polybrene was used at a concentration of 5 µg/ml for all the
cell lines except for D17 in which the concentration was 100 µg/ml. After overnight
infection, fresh medium was added and the cells were incubated for an additional 24
hours. After infection, the cells were lysed in 25 mM Tris-phosphate pH 8, 2 mM
DTT, 2 mM 1,2-diaminocyclohexane-N,N,N',N'-tetraacetic acid, 10% glycerol, 1%
Triton X-100, and assayed for luciferase activity using Luciferase Assay Reagent
(Promega, Madison, WI) in a 1251 BioOrbit Luminometer. The same number of cells
(range 5-10 x 10"*) was analyzed for every specific cell line.

Binding of sGP to neutrophils and inverse correlation of binding with
activation: Figure 1A. PBMC from normal volunteers were incubated with
control or sGP supernatants derived from transfected 293 cells, and immunostaining
was performed using a rabbit antibody to sGP as previously described. Sanchez, A.
et al., PNAS (USA) 93:3602 (1996); Xu, L. et al., Nat. Med. (1997) in press.
Secondary staining was performed with a fluorescein isothiocyanate (FITC)-
conjugated goat anti-rabbit IgG antibody (Sigma, F9887). All incubations were
performed at 4°C for 30 minutes with .4 µg of the relevant antibodies per 10^ cells in
a 50 µl volume.

Figure 1B. Double immunostaining with antibodies to sGP and the
neutrophil-specific marker, CD15. Cells were incubated with a FITC conjugated
mouse anti-human CD15 antibody (Caltag, cat# MHCD1501), followed by secondary
staining with a PE-conjugated anti-rabbit IgG antibody (Sigma) to detect sGP binding.
Cells were washed with PBS, fixed in 1% formaldehyde, and analyzed by FACS.

Figure 1C. Specific absorption of sGP by neutrophils. Control or sGP
supernatants derived from relevant transfected 293 cells (Xu, L. et al., Nat. Med.
(1997) in press) were incubated at 1:500 dilution with 10® mononuclear or granulocytic
cells. Cells were removed and the resulting supernatants analyzed by an 8% SDS
PAGE gel. Western blot analysis was performed as previously described (Xu, L. et
al., Nat. Med. (1997) in press) using an anti-GP rabbit antisera and a secondary
antibody, horseradish peroxidase conjugated donkey anti-rabbit IgG at a dilution of
1:5,000 (Amersham, NA934). Primary antibody was incubated for 30 minutes at room
temperature, as was the secondary antibody. The immunocomplexes were detected
by chemiluminescence using Supersignal® chemiluminescent substrate reagents
(Pierce) according to the manufacturer's instructions. Arrow indicates sGP reactive
band.

Infection of different cell types by GP-pseudotyped retroviral vector and
preferential binding to endothelial cells: Figure 2A. Infection of different indicator
cell lines with the Ebola-GP pseudotyped retrovirus expressing luciferase.
Amphotropic and ecotropic retroviral vectors were used as controls. Viruses were
diluted to different multiplicities of infection (MOI) to provide for equal luciferase
activity on HeLa cervical epithelial cells, permissive for amphotropic retrovirus, D17
dog osteosarcoma cells (Embretson, J.E. et al., J. Virol. 61:3454 (1987)), which are
permissive for amphotropic, xenotropic, and ecotropic retroviruses, and BW5147 T
leukemia cells permissive for amphotropic and ecotropic virus. In these groups, GP
virus titer was 1-4x10®/ml and amphotropic virus was ~2x10Vml (MOIs ~1.0 and 0.1,
respectively), and the ecotropic virus titer was ~10^/ml (MOI ~10). Titers were
determined by endpoint dilution of reporter activity of the amphotropic virus in D17
cells, and were normalized to reverse transcriptase activity for the GP virus.
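
The relationship between titer and multiplicity of infection used in this legend can be sketched as follows. This is a minimal illustration that assumes the conventional definition of MOI (infectious units added per target cell); the function name, its parameters, and the example numbers are hypothetical and are not taken from the experiments described here.

```python
def multiplicity_of_infection(titer_per_ml, volume_ml, n_cells, dilution=1.0):
    """Return infectious units added per target cell (conventional MOI).

    titer_per_ml: infectious units per ml of undiluted supernatant
    volume_ml:    volume of (diluted) supernatant added to the cells
    n_cells:      number of target cells in the well
    dilution:     fraction of undiluted supernatant in the inoculum (1.0 = neat)
    """
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    return titer_per_ml * dilution * volume_ml / n_cells


# Hypothetical numbers for illustration only (not values from the text).
example_moi = multiplicity_of_infection(titer_per_ml=2e5, volume_ml=1.0, n_cells=2e5)
print(example_moi)
```

Diluting the supernatant or changing the inoculum volume scales the MOI linearly, which is how stocks of different titer can be brought to comparable reporter activity on a given cell line.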

Figure 2B. Analysis of different normal or transformed cell lines by infection
with amphotropic or GP retroviral vectors at the same titer (1 OVml, MOI 0.2). Forty-
eight hours after infection, an equivalent of 5 x 10"* cells was assayed for luciferase
activity after exposure to equal titers of viral stocks. Luminescence is expressed as
the fold-increase over non-infected control cells.
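
The fold-increase described in this legend is a simple ratio of reporter signal in infected cells to that in non-infected control cells. A minimal sketch is given below; the function name and the example readings are hypothetical, and the text does not specify any background subtraction, so none is assumed.

```python
def fold_increase(sample_rlu, control_rlu):
    """Return luminescence of infected cells relative to non-infected controls.

    sample_rlu:  relative light units measured for the infected cells
    control_rlu: relative light units measured for the same number of
                 non-infected control cells
    """
    if control_rlu <= 0:
        raise ValueError("control_rlu must be positive")
    return sample_rlu / control_rlu


# Hypothetical readings for illustration only (not values from the text).
example_fold = fold_increase(sample_rlu=5000.0, control_rlu=50.0)
print(example_fold)
```

Because the same number of cells is assayed for each cell line, the ratio can be compared across vectors within a cell line without further normalization.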

Figures 2C-2C3. The binding of sGP (left) or GP-pseudotyped retrovirus
(right) to neutrophils (upper panel) or microvascular endothelium (lower panel) was
determined by FACS. sGP binding was performed as in Fig. 1A, and retrovirus
incubation was performed at 37°C for 2 hours in the presence of polybrene (8 µg/ml).
Figure 2D. Infection of D17 cells by GP-pseudotyped virus in the absence
(lane 1, none) or presence of control (lane 2) or sGP supernatant (lane 3) from
transfected 293 cells. Gene transfer was measured by the luciferase assay as
described below. Luminescence refers to relative light units in the luciferase assay.

Dependence of sGP binding on CD16b and correlation of binding with
neutrophil activation: Figures 3A-3D. Neutrophils were incubated for 30 minutes
at 4°C with a mouse antibody to CD16b (upper panel; clone 3G8 from Immunotech,
cat# IM0813) or CD62L (middle panel, R&D Systems), compared to the indicated
control antibody [purified mouse IgG (Vector Laboratories), I-2000], followed by
supernatants from control or sGP-transfected 293 cells, primary rabbit antibody to
sGP, and a FITC-conjugated secondary antibody to rabbit IgG (Fig. 1, legend). Cells
were washed with PBS, fixed in 1% formaldehyde, and analyzed by FACS. For
blocking, 10® cells were incubated with 0.5-1 µg of the relevant antibodies for 30
minutes in a 50 µl volume.

Figures 3E-3F. Immunostaining with sGP was performed on isolated
neutrophils which were maintained in media (none) or incubated with PMA (10 ng/ml)
at 37°C for 30 minutes (PMA).

Effect of sGP on neutrophil function: Figures 4A-4B. Exposure of
neutrophils to sGP inhibits down modulation of L-selectin. Isolated neutrophils were
incubated with the indicated control or sGP containing supernatants (Xu, L. et al., Nat.
Med. (1997) in press) and defined media (AIM V, GIBCO) for 4 hours at 37°C.
Expression of L-selectin was determined using an anti-CD62L antibody (R&D
Systems), followed by the secondary staining using a FITC-conjugated anti-mouse
IgG (Sigma, P9RR'^^) as described in Fig. 1 legend. Cells were washed with PBS,
fixed with 1% formaldehyde and analyzed by FACS for relative levels of fluorescence
intensity as a function of cell number. An isotype control was used to quantitate
background levels of immunostaining (neg.). Results are representative of three
independent experiments.

II. Results

To determine the specificity of Ebola virus glycoproteins, expression vectors
encoding either sGP, GP, or a plasmid control (Xu, L. et al., Nat. Med. (1997) in
press) were transfected into 293 cells, and cell culture supernatants were used as a
source of relevant recombinant glycoproteins. Binding of sGP was determined by
immunofluorescence analysis after incubation of relevant supernatants with normal
or transformed human cell lines. No binding was detected to several hematopoietic
lineages, including lymphocytes or monocytes (Fig. 1A), or transformed Jurkat or CEM
T leukemias, the HL60 myelomonocytic or U937 promonocytic leukemia cells. In
contrast, sGP was able to bind to granulocytic cells, as evidenced by FACS analysis
of this subset of peripheral blood mononuclear cells (PBMC) discriminated by cell size
and granularity (Fig. 1A). This cell specificity was confirmed by using double-staining
with a granulocyte-specific cell surface marker, CD15 (Fig. 1B). Absorption of sGP
by purified neutrophils in the absence of antibodies also resulted in depletion of sGP,
indicating that binding to the neutrophil occurred in the absence of antibody (Fig. 1C).

A potential structural similarity between Ebola GP and avian sarcoma virus
envelope protein has been previously proposed (Gallaher, W.R., Cell 85:477 (1996)),
raising the possibility that this protein could be incorporated into retroviral particles.
To determine the binding specificity of the transmembrane glycoprotein, pseudotyping
of a Moloney leukemia virus was therefore attempted. Infectivity of different cell types
by this pseudotyped vector was determined with a luciferase reporter gene. de Wet,
J.R. et al., Mol. Cell. Biol. 7:725 (1987). This analysis revealed infection of cells
different from those which interacted with sGP (Fig. 2A,B). For example, though it
could infect other cell types, transduction by the GP retroviral vector readily occurred
in endothelial cells, either from the microvasculature (MVEC) or umbilical veins
(HUVEC) (Fig. 2B), which did not bind sGP (Fig. 2C, left). When the specificity of
GP-retrovirus was compared to murine retroviruses pseudotyped with amphotropic or
ecotropic envelope gp70 proteins, the range of susceptible target cells differed (Fig.
2B), suggesting that the virus receptor(s) for Ebola GP differ from those previously
described for gp70. Minimal binding of GP-virus was observed on neutrophils, despite
the ability of these cells to bind sGP (Fig. 2C, upper panel) and the fact that
immunoreactive protein was detected on the virus. Conversely, GP-virus binding to
endothelial cells was readily detected, though these cells did not bind sGP (Fig. 2C,
lower panel). When sGP was analyzed for its effect on GP retroviral gene transfer,
infection was not inhibited by sGP (Fig. 2D), further confirming the divergent
specificities of the two forms of the viral glycoprotein. Recent studies have revealed
that the biochemical forms of these proteins differ, with sGP present in solution
primarily as a homodimer and GP as a trimer, suggesting that differences in multimer
composition may contribute to these alternative specificities.

Potential cell surface receptors for sGP were analyzed with antibodies to
several neutrophil cell surface antigens to interfere with sGP binding, including CD15,
L-selectin (CD62L), CD16b, and several common leukocyte antigens. Only the
neutrophil-specific form of the low affinity Fcγ receptor III, CD16b, inhibited sGP
binding specifically. Antibodies to CD62L, for example, did not inhibit sGP binding
(Fig. 3). Binding to neutrophils correlated with their activation state and CD16b
expression since no binding was observed in cells stimulated with phorbol 12-
myristate 13-acetate (PMA) for 30 minutes, at which time CD16b expression was
markedly decreased on these cells (Fig. 3, lower panel). Overexpression of this form
of CD16 on a heterologous cell type, 3T3 fibroblasts, did not confer sGP binding to
these cells by FACS analysis, suggesting that CD16b is necessary but not sufficient
for stable binding.

Binding of sGP did not inhibit neutrophil activation in response to potent
pleiotropic activators (PMA, IL-8, or f-Met-Leu-Phe), as measured by down modulation
of L-selectin expression using FACS analysis. In a defined serum-free medium,
partial activation of neutrophils was observed, with a decrease in L-selectin
expression at 4 hours (Fig. 4). Under these conditions, incubation of neutrophils with
sGP supernatant prevented this decrease in L-selectin expression (Fig. 4). Because
L-selectin was not required for sGP binding (Fig. 3), this effect was apparently
indirect, through a mechanism not yet defined, possibly involving CD16b or
carbohydrate interactions of the highly glycosylated sGP protein.

The expression of alternative Ebola virus glycoproteins in clinical infection has
long been recognized, but their functional roles and cell specificity have not been
defined. Early after infection, high levels of the secreted protein are found in the
serum and precede fulminant replication and dissemination of virus systemically, at
which time synthesis of transmembrane GP is markedly increased. Sanchez, A. et
al., PNAS (USA) 93:3602 (1996). The inventors have now found that the binding
specificities of these two molecules differ. It had been proposed that sGP may serve
as a decoy to prevent recognition of GP, possibly to temporarily inhibit virus binding
to target cells. The studies set forth herein suggest that this hypothesis is unlikely to
be correct. The binding specificities of these proteins differ, and despite the fact that
they are derived from the same viral gene, it has been surprisingly found that
alternative forms of the glycoprotein have been selected for different functions.

Although these proteins share identical amino terminal sequences, their 
carboxyl terminal regions differ. Sanchez, A. et al., Virus Res. 29:215 (1993). These 
sequences are likely responsible for the differences in binding specificity, either 
through direct interactions in these domains or by their effect on multimerization. The 
secreted glycoprotein binds to neutrophils to prevent early events in activation, 
possibly serving to diminish any inflammatory responses which might provide innate 
immunity to the virus, facilitating productive viral replication. The subsequent increase 
in GP synthesis gives rise to virus which in turn could infect other cells. Filoviruses 
have been shown previously to infect and replicate in different cell types and appear 
to grow readily in endothelial cells in vivo. Peters, C.J. et al., Fields Virology, B.N. 
Fields, D.M. Knipe and P.M. Howley, Eds. (Lippincott-Raven, Philadelphia) (1996); 
Schnittler, H.J. et al., J. Clin. Invest. 91:1301 (1993). The findings set forth herein 
suggest that its tropism for this cell type is probably determined by the specificity of 
GP. In Ebola infection, preferential binding and infection of microvascular endothelial 
cells may lead ultimately to a loss of capillary integrity that results in the severe 
hemorrhage observed in the terminal stages of this disease. The differential binding 
of these two gene products from the same viral structural gene generated by RNA 
editing suggests that they have evolved functionally to differentially affect immunity 
and infectivity. The ability to facilitate viral replication and target the virus to 
endothelial cells by alternative products of the same viral gene represents an efficient 
genetic mechanism which can account for different pathologic features of this disease. 
Inhibition of sGP binding to neutrophils and GP to endothelium is likely to ameliorate 
the effects of acute Ebola virus infection. 

SPECIFIC EXAMPLE 2 

I. Methods 

Production of pseudotyped MuLV retroviruses expressing green 
fluorescent protein (GFP): 50% - 70% confluent 293T cells in 60 mm tissue culture 
dishes were transfected using the calcium phosphate method and the following 
plasmids: 0.3 µg 1012 GP(Z) (see Figure 8) or 1012 sGP-GP(Z) (see Figure 9), 3 µg 
LZR-gfp, and 2 µg pNGVL-gag-pol. After overnight transfection, fresh media was added 
to cells. Twenty hours later, the supernatants were harvested and filtered through 
a 0.45 µm filter. 

Infection of HUVEC cells using the pseudotyped retroviruses: The day 
before infection, 30% - 50% confluent HUVEC cells were prepared in 6-well plates. 
1 ml of pseudotyped retroviral supernatant was added to one well of the 6-well plates 
with 15 µg/ml of polybrene. Sixteen hours later, the viruses were removed and 
normal media was added. After 24 hours, the cells were lifted and GFP expression 
was measured using FACS analysis. 
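
The FACS readout described above reduces to counting the fraction of cells whose GFP signal exceeds a gate. The following is a minimal sketch of that calculation, not the inventors' analysis procedure. The function names, the choice of a gate taken from an uninfected control, and all intensity values are hypothetical and serve only as illustration.

```python
from typing import Sequence


def gate_from_control(control: Sequence[float], quantile: float = 0.99) -> float:
    """Pick a gate so that roughly `quantile` of control events fall at or below it."""
    if not control:
        raise ValueError("no control events recorded")
    ordered = sorted(control)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]


def percent_gfp_positive(fluorescence: Sequence[float], gate: float) -> float:
    """Return the percentage of events whose GFP signal is strictly above the gate."""
    if not fluorescence:
        raise ValueError("no events recorded")
    positive = sum(1 for value in fluorescence if value > gate)
    return 100.0 * positive / len(fluorescence)


# Hypothetical per-cell GFP intensities (arbitrary units); not data from this document.
mock_infected = [3.0, 5.0, 4.0, 6.0, 2.0, 5.0, 4.0, 3.0, 6.0, 5.0]
gp_pseudotyped = [4.0, 120.0, 5.0, 300.0, 90.0, 3.0, 250.0, 6.0, 180.0, 2.0]

gate = gate_from_control(mock_infected)
rate = percent_gfp_positive(gp_pseudotyped, gate)
print(f"gate = {gate}, GFP-positive = {rate:.1f}%")
```

Deriving the gate from an uninfected control is one common convention; a fixed threshold could be substituted without changing the rest of the sketch.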

Construction of 1012 sGP-GP(Z): 1012 sGP(Z) cells were digested with PstI 
and treated with Klenow, then digested with XbaI. 1012 GP(Z) cells were digested 
with EcoRI and treated with Klenow, then digested with KpnI. The PstI/Klenow/XbaI-treated 
sGP fragment and the EcoRI/Klenow/KpnI-treated GP fragment were then cloned into 
XbaI/KpnI-treated pVR-1012 plasmid. 
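
When planning a cloning scheme of this kind, it can help to locate the recognition sites of the four enzymes in a candidate sequence. The sketch below does so for a linear sequence using the standard recognition sequences of PstI, XbaI, EcoRI and KpnI. The helper names and the toy sequence are hypothetical, and the plasmid sequences of this document are not used.

```python
from typing import Dict, List

# Standard recognition sequences of the enzymes named in the construction step.
RECOGNITION_SITES: Dict[str, str] = {
    "PstI": "CTGCAG",
    "XbaI": "TCTAGA",
    "EcoRI": "GAATTC",
    "KpnI": "GGTACC",
}


def find_sites(sequence: str, site: str) -> List[int]:
    """Return 1-based start positions of every occurrence of `site` (linear search)."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(site, start + 1)  # allow overlapping matches
    return positions


def site_map(sequence: str) -> Dict[str, List[int]]:
    """Map each enzyme name to the positions of its site in `sequence`."""
    return {name: find_sites(sequence, site) for name, site in RECOGNITION_SITES.items()}


# Hypothetical toy sequence assembled from spacers and one copy of each site.
toy = "AAA" + "TCTAGA" + "CCC" + "CTGCAG" + "TTT" + "GAATTC" + "GGG" + "GGTACC" + "AAA"

sites = site_map(toy)
for enzyme, positions in sites.items():
    print(enzyme, positions)
```

The search treats the sequence as linear, so a site spanning the origin of a circular plasmid would be missed unless the sequence is extended by a few bases from its own start.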

GP and sGP derivatives: The receptor recognition domain, mucin-like 
domain and/or TM domain of GP and sGP were mutated. The mutated GP and sGP 
were then tested for their ability to pseudotype and for cytotoxicity in producer cells. 
II. Results 

To determine the efficacy of targeting endothelium with the gene transfer 
vectors pseudotyped with GP of the present invention, HUVEC cells were infected 
with GP(Z) pseudotyped MuLV retrovirus (LZR-gfp) and sGP-GP(Z) pseudotyped 
MuLV retrovirus (LZR-gfp). Figures 5A-5C show the infection rate (GFP expression) 
measured using FACS analysis. As shown in Figure 5B, the GP(Z) pseudotyped 
MuLV retrovirus (LZR-gfp) was effective in targeting and expressing GFP in 
endothelium. 

To determine whether mutating GP would affect its ability to pseudotype and/or 
decrease toxicity in producer cells, the receptor recognition domain, mucin-like 
domain and/or TM domain were mutated. Figure 10 shows the results. The optimal 
envelope is able to pseudotype but shows minimal toxicity. 

The foregoing discussion discloses and describes merely exemplary 
embodiments of the present invention. One skilled in the art will readily recognize 
from such discussion, and from the accompanying drawings and claims, that various 
changes, modifications and variations can be made therein without departing from the 
spirit and scope of the invention as defined in the following claims. 

All patents and other references cited herein are incorporated by reference as 
if fully set forth. 
WE CLAIM: 

1. A genetic construct comprising a gene operatively-linked to a carrier, 
wherein the carrier is associated with a transmembrane form of viral glycoprotein or 
derivative thereof. 

2. The genetic construct of Claim 1, wherein the transmembrane form of 
viral glycoprotein or derivative thereof is expressed on the surface of the carrier. 

3. The genetic construct of Claim 1, wherein the transmembrane form of 
viral glycoprotein or derivative thereof is from Ebola. 

4. The genetic construct of Claim 1, wherein the carrier is a viral vector. 

5. The genetic construct of Claim 1, wherein the carrier is a non-biologic 
gene targeting vehicle. 

6. The genetic construct of Claim 4, wherein the viral vector is a retroviral 
vector. 

7. The genetic construct of Claim 4, wherein the viral vector is a lentiviral 
vector. 

8. The genetic construct of Claim 5, wherein the non-biologic gene 
targeting vehicle is a liposome. 

9. The genetic construct of Claim 5, wherein the non-biologic gene 
targeting vehicle is a DNA-protein complex. 

10. A method of targeting a gene to a cell comprising the step of 
administering to a cell population a genetic construct comprising the gene operatively-linked 
to a carrier, wherein the carrier is associated with a transmembrane form of 
viral glycoprotein or derivatives thereof. 

11. The method of Claim 10, wherein the transmembrane form of viral 
glycoprotein or derivative thereof is expressed on the surface of the carrier. 

12. The method of Claim 10, wherein the transmembrane form of viral 
glycoprotein or derivative thereof is from Ebola. 

13. The method of Claim 10, wherein the carrier is a viral vector. 

14. The method of Claim 10, wherein the step of administration is ex vivo. 

15. The method of Claim 10, wherein the step of administration is in vivo. 

16. The method of Claim 10, wherein the cell is an endothelial cell. 

17. The method of Claim 10, wherein the cell is a hepatocyte. 

18. The method of Claim 10, wherein the cell is a monocyte. 

19. The method of Claim 10, wherein the cell is a dendritic cell. 

20. The method of Claim 14, further comprising the step of introducing the 
cell population to a subject. 
SEQUENCE LISTING ID NO: 1 

pVR 1012-GP(IC) 
General Description 

Z22:A pVR 1012-GP(IC) 
Local object 

Created: 09/14/98 04:17 PM 
Last Modification Date: ? (no data) 
Length: 7003 bp 
Storage type: Basic 
Form: Circular 
Comments 

Restriction Map 
BglII: 1 site 
ClaI: 1 site 
DraIII: 1 site 
EcoRV: 1 site 
HindIII: 1 site 
HpaI: 1 site 
KasI: 1 site 
KpnI: 1 site 
NarI: 1 site 
PmlI: 1 site 
PstI: 1 site 
PvuI: 1 site 
SacII: 1 site 
SalI: 1 site 
XmnI: 1 site GAANNNNTTC 
EcoRI: 2 sites 
NcoI: 2 sites 
NdeI: 2 sites 
SphI: 2 sites 
XhoI: 2 sites 
BamHI: 3 sites 
BclI: 3 sites 

Functional Map 

CDS (4 signals) 

CMV IE 5' UT 
Start: 886 End: 1129 

CMV IE INT 
Start: 1130 End: 1840 

TbGH 
Start: 4020 End: 4572 

Kanr 
Start: 6068 End: 6690 (Complementary) 

Misc_feature (2 signals) 

CMV enhancer 
Start: 248 End: 885 

GP(IC) 
Start: 1870 End: 4019 
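
As a consistency check on the functional map, the feature coordinates can be converted to lengths and tested for overlap. The sketch below uses only the four features whose coordinates are clearly legible in this listing (CMV enhancer, CMV IE INT, GP(IC) and TbGH) and the stated plasmid length of 7003 bp. The helper names are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
from itertools import combinations
from typing import Dict, Tuple

PLASMID_LENGTH_BP = 7003  # length stated in the General Description

# (start, end) pairs as given in the Functional Map; assumed 1-based, inclusive.
FEATURES: Dict[str, Tuple[int, int]] = {
    "CMV enhancer": (248, 885),
    "CMV IE INT": (1130, 1840),
    "GP(IC)": (1870, 4019),
    "TbGH": (4020, 4572),
}


def feature_length(span: Tuple[int, int]) -> int:
    """Length in bp of an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def overlaps(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if two inclusive spans share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


lengths = {name: feature_length(span) for name, span in FEATURES.items()}
clashes = [
    (n1, n2)
    for (n1, s1), (n2, s2) in combinations(FEATURES.items(), 2)
    if overlaps(s1, s2)
]

for name, length in lengths.items():
    print(f"{name}: {length} bp")
print("overlapping features:", clashes)
```

Under these assumptions the GP(IC) insert spans 2150 bp and abuts the TbGH element without overlapping it.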

Annotations 
1 ^GCGCGTTT CCCTCilTGAC GCTCXXAXCC TCTGACACAT CCACCTCOCO 
j^GCGCGCAJJ^ CCCACTACTC CCACTTTTGO A6JS£TCTC2A CCTCGACCCC 



51 GA«CCOTCX CAC CI TCT C T CTAAGCCWT CCCCOOAGCA OACAMCCCG 
CTCTSCCACT CTCCAACAGA CATTCCOCTA COOCCCTCCT CTCTTCGGGC 



101 TOCSaCCCG TCACCCGQTG TTGGCGCGTC TCGCGGCTGG CTTAACTATC 

ju:rcccGccc j^ctcgcccac aaccgcccxc agccccgacc caxttoatac 



KdeX 

151 CCCCASCAGA CCACXTTCTX CTGXGACTCC XCCATXTOCO CTCTCXXXTA 
CCCGTAKCr CCTCTAACAT CACTCCCACO TGCTATACCC CACACTTTAT 



201 CC6CACAGAT CCCTAAOa\G AAAATACCGC ATCACATTCC CTAWCGCCA 
GGC G T C S CTA CGC A TTCCTC TTrTATGGCG TACTCTAACC GATAACCCCT 



251 TTGCA7ACGT TCTATCCATA TCATAATATC ffACATT?ATA TTCCCTCAM 
AACGTATGCA ACATAGQTAT AGTATTATAC ATGTAAXTAT AACCGAOTAC 



301 TCCAACAOTA CCGCCXTGT2 GACATTGAW ATTGACTAG? TATTAATACT 
AGG7TGTAAT CGCGGTACAA CCCTAACTAA TAACrGATCA ATAATTATCA 



3«* AACCAATTAC CGGGTCATTA CTTCATAGCC CATATAWCA GTrCCGCGTT 
TTAC7TJLATG CCCCXCTAXT CAAGTATCCG CTATATACCT CAAGGCCCAA 



401 ACATAACTTA CGGTAAATGC CCCCCCTGGC TGACCCCCCA ACGACCCCCC 
^ATTGAAT GCCArTTACC GGGCGGACCC ACTGGCGGGC TGCTCCCCCC 



451 CCCATTGACC TCAATAATCA CCTATGTTCC CATACTAACG CCAATACCCA 
GGCTAXCMC AGtTATXACT GCATACAAGG CTATCATTGC GGTTATCCCtT 



501 CTTXCAYTG ACGTCAATCC GTGGAGTATT TACCGTAAAC TGCCCACTTG 
CAAAGGTAAC TCCACrTACC CACCTCATAA ATCCCATTTC ACCGGTGAAC 



551 CCAGTACATC AACTGTATCA TXrCCCAAGT ACGCCCCCTA TTGACGTCAA 
CGTCArCTAO TICACATAGT A7ACGGTTCA TCCCCGGGAT AACTCCACTT 



«Q1 'TGACGGTAAA TGGCCCGCCT CCCATTA"rGC CCACTACA7G ACCTTATCCC 
ACTGCCXTTT ACCGGCCCGA CCGTAATACC CCTCATGTAC TCGAATACCC 



Mcx>I 



«fit actttcctac ttoocactac atctacotat txgtcatcoc tattaccxtc 
maSgcato aaccotcatg tacatccata atcagtagcc ataatggtac 



KOOZ 



701 CMMOCCCT CTTGOCAGTA CXTCAXTOGO CGTGGJTACC «TTTGACTC 
CACTXCCCCX AAJUCCC7CAT CTACTTACCC CCACCTATCG CCAAACTCfcC 



7S1 ACCCCCATTr CCAACTCTCC ACCCCATTGA CCTCXATCGG AGTTWTTTT 
StcCCCTAAA CCWCAGAOQ TOOCCTAACT CCACTTACCC !rCAAACAAAA 



801 COCADCAAAA CCAACGOGAC TTTC CAXAA T CTCGTAACA^ CTCCCCCCCA 
CCCTCCTTT7 AOTTOCCCTC AAACGTraTA CACCATTCTT GAGGCGGGGT 
• 51 nCACGCMA. TCCGCOGTAG CCCTGXACCG 7GGGAGG7CT ATMAACCAO 
A&CTGCGTTT ACCCGCCATC CCCACATGCC ACCCTCCAGA TAXATTCOTC 



SOI ACCTCSTTTA CTGAACCQTC AGATCGCCTC GACACaCCA7 CCACGCTCTT 
TCOASCAAAT CAC7TGQCAG 7C7AGCGGXC CTCTGCGGTA GGTGCGACAA 



SacZX 



951 TTGACCTCCA TACAACACAC COGGACCCAT CCA GO CT CC G CGGCCGGGAA 
AACtOGAGGT ATCTTCTOTO GCCCTGGCTA GG7CGGAGGC CCCGGCCCTT 



1001 CCCTGCATTG OAACGCGOAT TCCCCC7CCC AAGAGTGACG TAACTACCGC 
CCCACGXAAC CTTGCGCCTA AGGGCCACGG TTC7CACTCC ATTCATGGCG 



SphZ 



1051 CTATAGACTC TATAGOCACA CCCCTTTGGC 7CTTA?CCAT GCTATACTCT 
GATA7CTGAG ATATCCGTGT GGGGAKACCG AGAATACGTA CGATATGACA 



1X01 rrrrcccTTC ccccctatac accccccctt ccttatgcta tacgtgatog 

AAAACCGAAC CCCGGATATG TGGGGGCGAA GGAATACGA7 ATCCACTACC 



1151 TATAGCtTAG CCTATAGGTG I'GCGTTATTG ACCATrATTG ACCACTCCCC 
7.TA7CGAATC GGA7ATCCAC ACCCAATAAC TCCTAATAAC TCGTCACCGG 



1201 TATrTGGTGAC CATACTTTCC ArTACTAATC CATAACATCG C?CTTTGCCX 
ATAACCACTG CrATGAAACG TAArCATTAG CTATT-GTACC GACAAACCGT 



1251 CAACTATCTC TATTCCCTAT ATGCCAATAC TCTCTCCPTC AGACACTCJU: 
CrrCATAGAG ATAACCGATA TACGOTTATG AGACAGGAAC 7CTCTGACr?G 



1301 ACGGWrrCTG TATTTTTACA GGATGGG3TC CCATTTATTA TTTACAAATT 
rCCCTGASAC *,TAAAAA?^ CCTXCCCCAG GGTAAATAAT AAATGTTTAA 



1351 CACATATACA ACAACOCCGT CCCCCCTCCC CGCAGTTTrT AKAAACATA 
G7CTATATGT TGTTCCGGCA GGGGGCACGG GCGTCAXAAA TAATTTCTAT 



1401 GCCTCCGATC TCCACCCGAA TCTCCGCTAC GTGTTCCOCA CATGGGCTCT 
CCCACCCTAG AGGTGCGCTT AGAGCCCATO CACAACGCCT G7ACCCCACA 



1451 TCTCCGGTAG CCGCGGAGCT TCCACATCCO ACCCCTGGTC CCATGCCTCC 
AGAGGCCATC GCCGCCTCCA AGCTGTACCC TCGGGACCAC CGTACGGAGG 



1501 AGCCCCTCAT CGTCCCTCCG CAGCTCCTPO CTCCTAACAC TGGAGQCCAO 
7CCCC6A8TA CCACCOAGCC GtCGXGSAAC GACGATTGtZC ACCTCCGCTC 



1S51 ACTTAOOCAC ACCACAATGC CCACCACCAC CAGTCTCCCG CACAAGOCCO 
TCAAICCGTG TCGTC7TACC CGTGGT G CT C CTCAC A COGC CTCTTCCGGC 



IfiOl TGGC6GTAGG CTATGTGTCT CAAAATGAGC CTGOACASTO GGCTCGCAC6 
ACCGCCATCC CATACACAGA CTTTTACTCG CACCTCTAAC CCGAGCGTCC 



1C51 CCTGACGCAG ATGGAXGACT TAAGGCA6C6 G CACAAQAX Q ATOCACCCAO 
CGAC7GCCTC TAOCTTCTCA ATTCCCTCGC CGTCTTCXTC 7ACGTCC6TC 



1701 RGAffTXCTT GTATTCTGAT AA6AGTCACA COTAACTCCC CTTGCGGtGC 
GACTCAACAA CATAACACCA TTCTCACTCT CCAT7CACGG CAACGCCACO 
1751 


TGTTAKCGGT GGACCCCAGT CTACTCTGAG CACTACTCGT TGCTGCCCCG 
ISxTTGCCA CCTCCCGTCA CXTCXGACTC CTCATGXGCA ACCW:CGCOC 










1801 


CCCCCCACOl GACATAXTAO CTi^^Cti^ l^^^l^!^ IISS^I 
CCCCCCTGGT CTCTXTTATC CACTCTCTGA TTG7CTCACX AGGMAGCTA 






Sail 






KCOl Mtl ftnll Bell ECOKV 




1851 


^^GTCmrC TGCACTCACC GTCCTCGACA CGTCTGATCA CATATCGCGG 
CCCACXAAAG ACffTCACTGG CAfiCAGCTGT GCACACTACT C7ATAGCGCC 






EcoRI 




IS 01 


cFSfcrff'Trr cctctaqam tctctaatca ca5tca*t-*a "frrii-Ji-. 

CMGCCCCCC CCAGATCTTA ACJlOJlTrAGT CTCACTAGTA CCCTCCCaST 




19 SI 


rVMiTTCTGC AA^TGCCCCG iW^GGGwXiV AVt*AA«AW#%. a is^A a *to*^*.*^t§: 
^SSS TTlScCCcSc ACrCGCCAXG TCCTKTCTX CAAAQAMCX 




2001 


S^CCCXTTAT TAGGATAACa TATTTCACAA AAGTTAGCCC AACCCCCAAC 




2051 


m*fi.f^%fxx TXCCCTACAA GTGAOTOATA TCCACAAGTT TCTCIGCCOA 
S^MKCTT AMGGATCTT CACTCACTAT AACTGTTCAA ACACACGGCT 




2101 


^^"^^ SS!^:-^^ ^S^^i? JSJSLcT 


= . = . 


2151 


^ **Tv**:*a.r*rjLxe CC£AAjCCAAA AGA*rGCCGTT 

GGCCAATGGA CTACCAACTG ATGiACCAAC &^»ww,www» Sllllt^tXXI^t » 
CCCGTTACCT CATCGTTCAC TACATCCrTC CCGTTGGXCr? TCTACCCCAA 




2201 


rrCGAGCTCC TGTTCCACCA AAGCTGGTAA A^^i^Si: Irrrwricc 
A^GCTCCACC ACAAGCTGGT TTCCACCATT TAATCCMCG ACCTCTTACC 




2251 


CCTGAGAACT GTTATAACCT OOCTMAAAG fiJGTIWTB ^J^^iji^ 
CGACXTTCA CAATATTGGA CCGATATTTC TTTCAACTAC CATCACTCAC 




2301 


CCTACCAGAA CCCCCTGAGO CACTCAGGGA 7"*^°^ I??r^T»r 
OQATOCXCTT CGCOCACTCC CTCACTCCCT AAAAGGGGCA ACGGCGATAC 




2351 


. — ^/^^^'■[•rfti^ 

TACACAAACT CTCACGAACT GGACCATCCC CACGMeACr 
ATCTGTTTCA CAOTCCTXCA CCTOOTACGO CTCCTCCTOA CCCGAAAGTO 




3401 


^5g5?S? SSSSSS S^ISSSS SS?^ 




2451 


TCGGCCiACA Kcctrrccco JiAoejJ««2;« 'SSSSSS tISSSSS 

JVCCCCCMGT TCGAAACGOC TTCCTCAAXA ACGTAAACAC TAGAACGCAT 




2501 


AGOCGCCAAX CGATTTTTtC CACTCTCCTC OlTTOCATOA OMTMOttC 
TCCCCCCTTT CCTAAAAAAS GTCA£»CGAO GTAACGTACT CGGACOGTSO 
2S51 


XTOACCACGC ATCCCTCCAG TTACTXTCAC ACGMAACAX TAXXRACOT 
TACTCCTuCC TAGOGACwTC AAXwiTAUUw a a a wwmwv 




2€nx 


GGTTGXTAAT TTTGGAACCA XCACCACAGA GTTTCTCTTC CAAGTCGATC 
• * ft to ^^iMi^^ftt ^&.x&m.o&&n fiT^Cift^f^^rt 






Xhol 




2SS1 


XTTTGACGTX TGTGCXGCTC CXCCCAXOXT TCACACCACX ATTCCTTOTC 
TAAXCTOCJIT ACACCTCGAC CTCCCTTCTA ACTCTCGTGT TAXGGAACAC 




270X 


CTCCTAAMC JUUICCATCTX CrCTCXTAAC COCACAXCTX ACXCAACXGO 
GAOOATTTAC T7TGGTACXT GAfiACTATTC CCCTCTTCAT 9GTCTTCTCC 




27S1 


AAAACTAATC TGGAAAXTAA ATCCCACTGT TCATACCAOC ArCOCTCXCT 
WTTGAKAG ACCTTTTATT TACCCTCACA ACTATGCTCC TACCCACTCA 




2801 


CCCCTTTCTG CCAAAATAAA AAAACTTCAC AAAAACCCTT TCAACTGAAO 
CCCGAAAiiAC CCTTTTATTT TTTTCAAGTG TTTITGCCAA AGT7CACTTC 




2851 


ACTTGTCTTT CCTACCTGTA CCAOAACCC AGAACCACiSt CCTTGACACG 
TCAACAGAAA CCATGGACAT CCTCTTTGGG TCTTOGTCCA GGAACTGTGC 




2901 


ACASCGACCC TCTCTCCTCC CATCTCCGCC CACAACCACC CAGGCGAAGA 
TSTCeCTCCC AGA6AGGACC GTAGAGGCGG C7CTTGG7GC GTCCGC11C7 




29£1 


CCACAAAGAA TTGCTtTCAG AGGATTCCAC TCCAGTGGTT CAGXTSCAAA 
C<j xV-VlX^TI ' AACCAAAGTC CCCTAAGGTG AGGTCACCAA GTCTACGTTT 




1001 


ACATCAACGG AAA6GXCACA ATGCCAACCA CAQTGACGGG TCTACCAACA 






Bell 




3051 


ACCACACCCT CrCCATTTCC AATCAATGCT CCCAACACTG ATCATACCAA 
TGGTCTCGGA GACGTXAAGG T7AGTTACCX GCGTTGTGAC TACTATOCTT 




3101 


XTCATTTATC CCCCTGGAGC GGCCCCAAQA AGACCACACC ACCACACAOC 
TAGTAAXXAG CCGGACC7CC CCGGGGTTCT TCTCGTGTCG TGGTC?GTCG 




3151 


CTCCCAAGAC CACCAGCCAA CCAACCAACA CCACACAATC GACGACACTX 
CACGGTTCM GTGGTCGGTT GGWGGTrCT CCTGTCTTAG CTCCTGTGAT 




3201 


AACCCAXCAT CAGAGCCCTC CAGTAGACCC AOaSQACCAT CCACCCCCAC 
nCGGTTGTA CTCTCGGGAG GTCATCTCCG TGCCCTGGTA GGTCGGGCXC 




3251 


GGTCCCCAAC ACCACAGAAX OCCACGCCGA ACTTCGCAAC ACAACCCCAA 
CCAGGGGTTG TCCTOTCTTT CQCTQCQQCT TOAACCCTTC TGTTGGGCTT 




3301 


CCACACTCCC AGAACAGCAC ACTCCCOCCA CTCCCATTO MGM^Sa 
GGTOTCACOG TCTTCTCGTO TGACCCCCGT CACOOTAACC TTCTCGCCAC 





3351 CXCCCCGACG jJ^CTCXGTGG ACCTGGCnC CTGXC GAACA CAATXCGGGG 
GTCGOGCTCC TT6AGTCACC TGQACCCAAC CAMGCTTGT CTTAMCCOC 
BaaBZ 



GCrrOACAAAT CTCCTOACAG GATCCACJUIG JOAGCGMGO GATGTCXCTC 
CCACTCnTA GAGGXCTCTC CXAGOTCnC TTTC CC TT C C CTACACTCUO 



3451 CCXXTACACX ACCCAAATGC AXCCCAAACC TGCACTATTG GACXCCCTTG 
CC7TATGTOT rCOGTVUkCO TTGG OllXC O ACCTCXTAXC CTGTCOQMC 



3501 QAJGAflOQTQ CTOOCXT&OG TITAOCCTGG ATJlCCAXACT TCOOGCCAGC 
CTACTCCCAC GACGGTATCC AAATCCCXCC TATGGTATGA AGCCCGGTCG 



3551 AGCTGAGOQA XTTTACACTO AAGGCATAXT CCAOJkASCJUl AXTGGXnGA 
TCGACTCCCT TAAATCTGAC TTCCCtATTA CCTCTTACIT TTACCTAACT 



3(01 TCTGTGGAIT GAGGCAGCTG GCCAACCXAA CGACACAAGC TCrXCAATTC 
ACACACCTAA CTCCCTCGAC CCCTTCCTTT GCTGTGTTCG AGAACTTAAC 



3651 T7CTTAAGGC CAACTAC7GA CTTGCCTACA T7CTCTATAC TAAATCCGAA 
AAGJU3TCCC G77GAT&AC7 CAACGCATGT AAGAGATATG ATTTAGCCTT 



3701 AGCAATACAC TTCTT G CT C C AAAOATGGGG AGQAACATC7 CACATTCTAG 
TCSTTATCTG AAGAACCACG TTTCTACCCC TGCPICTACA CTGTAAGATC 



3751 CGCCTCACTG TTGCATTCAA CCCCAAGATT GGACCAXAA^ TATCACTGAT 
CCCGACTAAC AACCTAACTT GGGGTTCrAA CCTCaTTTTT ATACTGACTA 



Bell 



3801 AAAATTCAK AAATAATCCA TGACTTTCTC GATAATAATC TTCCAAATCA 
TTTTAACTAG TTTATTACCT ACTGXAACAG CTATTATTAC AAGGTTTAGT 



2S51 a»»TCA?GGC AGCAA^TfiCT CGACTGGATG GAMCAATGG GTTCCTGCTC 
CTTACTACCG TCCTTGACCA CCTGACCTAC C^TCTGTTACC CAAGGACCAC 



3901 CAATAGGAKT CACACGAGTA ATCATTCCTA TTATTGCTTT GCTCTCCATT 
CTTATCCTTA CTGTCCTCAT TACTAACGAT AATAACGAAA CCACACGTAA 



£coRI 



3951 TCOUUmxiA TGCTT7GAAC 7AATATAGCA 7CATACTTTA GAATTCTAGA 
ACGTTrAACT ACGAAACTXG ATOATATCGT AGTATCAAAT CTTAAGATCT 



Karl 



Xasi BatDBZ BglZZ 



4001 CCAGCCOCCT GGASCCAOAT CTCCTGTCCC TTCTACTTCC CAGCCA7CTO 
GGTCCGCGQA CCTJUKB7C7A OACOACACGC AAGATCAAOO GTCCGTAGAC 



4051 TTGTTTGCCC C7CCCOCGTO CCTXCCTTGA GCCTGGAAGG T0CCAC7CCC 
AACAAACGGG GJIOOOOGCAC CCAXSGAACT GGGACCTTCC ACCGTGAOOa 



4101 ACTG7CCTTT CCTAATAAAA TGACGAAATT CCATCGCATT ISTCTGAOTAO 
TGACASQiUkA GGATCXTTTT ACTCCTTTAA CGTAQCGTAA CAGACTCATC 



4151 QTGXKTTCt ATXCTGGGGQ GTGGGGTCGG GCA6CACAGC AAGGGGGACG 
CACACTAAOA TAAGACCCCC CXCCCC A CCC CGTCCTCT C C TTOOC C CTOC 
KpnZ 



€201 KTTGSGMJUSK CAATMCAGO CAT0C7CCCG XTGCGG7GGC CTCTATGGGT 
TAACCCTTCT CTTATCGTCC CTACGACCCC TACCCCACCC GXGATACCCA 



4251 ACCCACCTGC TGAACXXTTG ACCCGGTTCC TCCTGGGCCA GAAXOUOCA 
TCGG7CCAC6 ACTTCTTAAC TCCC C CAAQG AGGACCCCGT CTTCCTTCCT 



4301 CCCACATCCC CTTCTCTOTG ACACACCCTO TCCACCCCCC T0G3TCTTAC 
CCGTCTAGGC GAAGAGACAC 7GTGTGGGAC ACCTCCCGGG ACCAAGAATC 



4351 TCCCAGCCCC ACTCATAGGA CACTCATAGC 7CA6GAGCGC TCCGCCT7CA 
AAGGTCCCCC TGAGTATCCt CTGAGXATCG ACTCCTCCCG AGGCGOAACT 



4^01 A7CCCACCCC CTAAAGZACT TCGAGCGGTC TCTCCCTCCC TCATCACCCC 
7AGGCTCGGC GATTTCATGA ACCTCGCCAG AGAGGGAGGG ACTAGTCOQO 



4451 ACCAAACCAA ACCXACCCTC CAAGAGTCCC AAGAAATTAA ACCAAGATA6 
TCCTTTGGTT TGCATCGGAG CT7CTCACCC TTCTTTAATO TCGTTCTATC 



4501 GCTATTAAOT GCACAGGGAG AGAAAAtCCC TCCAACATGt C&COA&GTAA 
CCATAATTCA CGTCTCCCTC TCTTTf ACGG AGGTTCTACA CTCCTTCATT 



4551 TGAGACAAAT CATA5AATTT CTTCCGCTTC CTCGCTCACT CACTCGCOGC 
ACTCrCTTTA CTATCTTAAA CAACGCGAAG CACCGASTGA CTGAGCCACC 



4C01 CCTCGGTCCT TCGGCTGCGG CCACCGGTAT CAGCTCACTC AAAG3CGGTA 
CGAGCCAGCA AGCCGACGCC CCTCGCCATA GTCGAGTGAG TTTCCGCCAT 



4e51 ATACGGTTA? CCACAGAATC ACGGGATAAC GCAGGAAAGA ACATGTGACC 
TATCCCAATA GGTGTCTTAC TCCCCTA7TC CCTCCTTTCT TGTACACTCG 



4701 AAAX6GCCAC CAAAAGGCCA CGAACCGTAA AAAGGCCGCG TTGCTGGCGT 
TTTTCCCGTC GmTCCOGT CCTTGGCATT TTOCCGGCCC AACGACCGCA 



4751 TTT^CCATAG GCTCCOCCCC CC7CACGAGC ATCACAAAAA TCGACGCTCA 
AAAAGGTATC CGAGGCGGGG CGACTGCTCG TAGTCTTTTT AOCTGCGACT 



4801 AGTCAGAGGT CCCGAAACCC GACAGGACTA TAAACATACC AGGCG777CC 
TCAGTCTCCA CCGCTTTGCC CTGICCTGAT ATTTCTATGG tCCCCAAACG 



4051 CCCTGGAAGC TCCCTCGTCC CCT C T C CTGT TCCGACCCTG CCOCTTACCO 
GGGACCTTCQ AGGGACCACO CQAGAGGUICA AOGCTGGGAC GGCGAATGGC 



4901 CATACC7G7C CCCCT7TCTC CCRCGOGAA GCGTGGCGCT TTCTCAA7GC 
CTATGGACA0 GCGGAAAGAO GOAAGCOCTT CQCAC C CCCA AAGAGrTACG 



4951 7CACGCTGTA GGIATCTCAG TOC G GTOTAC CTCOTTOGCT CCAAGCTOOG 
AfiTOCGACAT CCATAGACTC AAOCCACATC CAOCAAGCGA GGTTCGAOCC 



5001 CTGXCTGCAC GAACCCCCCG TTCAGCCCGA CCGCTGCCCC RATCCGGTA 
GACACACGTG CTTGOGOOGC AACTCGGGCT GGCGACGCGG AATAGGCCAT 
5051 AC»TCCTCT TCAOTCCJOC CCOGTAAflAC X C CA C TTXTC OCCACTCGCX 
TCAXACCAGjl ACTCACGTTO OGCCATTCTO TOCTGAATAC CGOTOACCCT 



5101 GCASCCACTO GTAACAGGAT TA5CACACCO ACGTATGTXC GCQCrrOCTAC 
CCTCGGTGAC CXTTGTCCTA ATCGTCT C QC TCCXTACXTC CGCCACCXTC 



5151 A<V^< f ' i ' J\T]Cr AXCTGGTGGC CTAACTACOG CTACACTACA AGGACAGTAT 
tCTCAAGAAC TICACCACCC CArTGATCCC CATC7GATCT TCCICTCATA 



S201 TTCGt X TCTG CQCTCtCCTC AAOCCAGTTA CCTTCGOAAA AAGACTTCOT 
AACCATAGAC GCGACACGAC TTCGG2CAAT GGAAGCCTTT nCTCAAOCA 



5351 AfiCTCTTGAT CCGGCAAACA AACCACCGCT CGTAGCGGTG GTTTTTTTGT 
9CGAGAACTA CCC Cii ' J ' nWl ' TTGGTQGCGA CCATCCCCAC CAAAAAAACA 



5301 TTOCAAGCAC CACATTACGC GCACAAAAAA AGSATC7CAA CAAOATOCTT 
AACGTTCCTC CTCTAATGCG CCTCTTrrrT TCCTACAOTT CTTCTAGGAX 



5351 tgaccttto: tacccggtct gacgctcagt ggaacgaaaa ctcacgttaa 
acrzagaaaag atgccccaca ctocgagtca ccttcctttt oagtgcaaot 



5401 GCGATTTTGG TCATGACAK ATCAAAAACG ATCTTCACCT AGATCCTTTT 
CCCTAAAACC ACCACTCTAA 7ACTTTTTCC TAGAACTCCA TCTAGGAAAA 



5451 AAATTAAAAA TGAHGTTTTA AATCAATCTA AACTATATAT CAGTAAACTT 
TTIAATTrrT ACTTCAAAAT TTAGKAGAT TTCATATATA CTCATTTCAA 



5501 C S T C TGACAC TTACCAATCC TTAATCAGTG ACCCACCTAT CTCAGCGATC 
CCAGACMTC AATGCTTACG AATTAGTCAC TCCGTGGATA GACTCGCTAG 



5551 TCCCTATTTC CITCATCCAT ACTTGCCTGA CTCCGGGGGG GGGGGCCCCT 
ACAGAXAAAG CAACTAGisTA TCaaCGGACT CAGGCCCCCC QQ^QCQGCGK 



560X GAGGTCTCCC TCGTCAACAA GGTGTTGCTC ACTCA7ACCA CGCCTGAATC 
CTCCAGACGG ACCACTTCTT CCACAACGAC TGACTAtXGT CCGGACTTAC 



5651 CCCCCATCAT CCAGCCAQAA AGTCAGGGAG CCACCGTTGA TGAGAGCTTT 
CGGGGTAGTA GGTCCGTCTT TCACTCCCCC GGTGCCAAC7 ACTCTCGAAA 



5701 GTTGTACCTG GACCACTOGG TGATTTTGAA CTTTTGCTPr CCCACGGAAC 
CAACATCCAC CTGGTCAACC AC t AAAA CT T GAAAACGAAA CGGTGCCTTG 



5751 GCTCrCCGTT CTCCCGAAGA CCCCTCATCT OATCCTTCAA CTCAG CAAAA 
CCAGACGCAA CAOCCCTTCT ACGCACTAQA CTACGAAGTT GACTCCTTTT 



SeOl GT7CGATTTA TTCAACAAAG CCGCCGZCCC CTCAAOTCAO CGTAA^GCTC 
CIOCCTAAAT AACTTOTTTC CGCCCCACGG CAGTTCACTC GCAT7ACGAG 



5S51 TOOCAGTCTT ACAAOCAATT AACCAATrCl CATTAGAAAA ACTCAT C CAC 
ACCCTCACAA TGTTCOTTAA TIGCTTAACA CTAATCTTXT TGAGTAGCTC 



SSOl CAICAAATCA AACTGCAATT TATTCATIATC AGQATCATCA ATACXATATT 
GTACTTTACT TTGACCWAA ATAACTATAC TCCTAAIAGT TATCCTATAA 



5951 7T3GAAAAAG CCGTTTCTCT AATG;JIGGAG AAAACTCACC GAGGCAGTTC 
AAACTTTTTC GGCAAACACA CTACTTCCT C TrtTGAGTCG CTCC6TCAAG 
cool CX«fi»TCG CAXOXTCCTO GTA70QGTCT OCGATTCCCX CTCCTC CAAC 

craocciACc ottctacgjlc cxtxgccma ccctaacoct ga c cacctpg 

4051 XTCAATACXX CCTXTTAATT TCCCCTCOTC AAAXXffXXCG 

TACrrAffCTT OOATAATTXX AGGGGAfiCAC KTTTATTCC AATACTTCAC 

HindXIl 



£101 AC3kAASCA0C ATOAGTGACG ACTCAATCCG GTGAGAATCG CAAAAGCTTA 
mV.A CTGQ TACTCACTCC TGACTTACOC CACTCTTACC GTTTTCOAAT 



«5X tlH^CJaV ' lLli TCCAGACTTC MCAACACCC CACCCA T TAC GCTCGTCATC 
ACCTAJOGAA AOOTCTCAAC AAGTTGTCCG GTCGGTAATO CCACCAGTAO 



C201 AAAATCACTC GCATCAACCA AACCGTTATT CATICOTGA? TGCCCCTGAG 
TIWAOTCAO CGTAGWCGT TTQGCAATAA CTAAGCACTA ACCCOGACTC 



PvtXl 



C251 CC3U»C6AAA TACGCGATCG CTCT^AAAAG GACAATTACA AACAGCAATC 
GC3CTGCTTT ATGCGCTACC GACAATTTTC CTGTTAATCT TTCTCCTTAG 



$301 GXATGCAACC OGCGCAGSAA CACTGCCAGC CCArCAACAA 5ATT;:CCACC 
C r XACCgTG G CCGCCTCCTT GTGACCCTCG CGTAGTTCTT ATAAAAGTCG 



€351 TGRA30CGA TATTCTTCTA ATACCTGGAA TGCTGTTrTC CCGGGGATCG 
ACKACTCCT ATAAGAASAT TATGGACCTT ACGACAAAAC GGCCCCTAGC 



640** CACTGCTGAG TAACCATGCA TCATCACCAG TACGGATAAA ATGCTTGATC 
* CrCACCACiC ATTGGTACGT ACTACTCCTC ATCCCTATTT TACCAACTAC 



$A%1 GTCCCaAGAG CCATAAATTC CCTCAGCCAG TTTACTCTGA CCATCTCATC 
CAGCCrXTC CGrATTXAAC ecAGaCGGXC AAATCAGACT GCTA_f?,^GTAO 



€501 TGTAACATCA TTOGCAACCC TACCTTTGCC ATGTTTCAGA AACAACTCTG 
ACATTCTAGT AACCCTOGCG ATGGAAACGG TACAAAGXCT TTGTTGAGAC 



Clal 

6551 GCCCATCGGC CTOCCATAC AATCGATAGA TTGTCCCACC 7GATT0CCCC 
CCCGTAGCCC CAAQ6GTATC TTAGCTATCT AACAGCGTGG ACTAACGGOC 



6€01 ACATTMCCC GAGCCCATTT ATACCCATAT AAATCAGCAT CCATOTTOGA 
TCWASACCO CTCCGGTAAA TATCGCTATA TTTACTCCTA GCTACAACCT 



XhoX 

USl AHTAATCCC GGCCTCGAGC AAGACGTTTC CCCTTOAATA TCCCTCATAA 

tjSaitacco ccggacmcg ttctgcaaag gocaacttat accgactatt 



CTOl CADCCCPTCT ATTACTOrTT ATC TAACCAQ J^CIfnTIi:? 

CSKGOAACA TAATOACAAA TACATTCOXC TCTCAAAATA ACAAOTACTA 



Drazzz 



€751 CMRTMTTT T A T C TTCTCC AATGTAACAT CAOAOATXTr CAGACACAAC 
CTATWAAAA ATACAACACG TTACATTQTA GTCXCTAAAA CTCTCTCTTO 
DrmXZX 

(801 CTCOCTTTCC CCCOOCCCCC ATTATTOAAO CATTTXTCUC CGTTATTOTC 
CAOOOAXAGC GGQGOOGOGO TMTAXCTTC OXAXATAOTC CCAATAACAC 



CS51 CCMCACCCC ATACXTATTT QAXTCTXTn AQX&AAATAA ACAAATAGGO 

a0;acxcgcc tatotatmx ctxacataaa TcrrmATT tctttxxccc 



»01 Ui^C C OC OCA CATTICCCCC AAAACTCCCA CCTOACGTCT AACAAACCAS 
C AaCGCQCCT CTAAACCCG C nTZCACOGT GCACTGCACA TTCTTTGOTA 



6951 TMSTMOCXrO ACATXAACCT A7AAAAATA0 GC6TATCACC ACCCCCTTTC 
AXA&TAJGTAC TG7AATT6GA 7A77TTTATC CCCATAGTGC TCCGGGAAAO 



7001 GTC 
CAC 
pVR 1012-GP(S) 

General Description 

Local object 

Created: 09/14/98 03:58 PM 

Last Modification Date: ? (no data) 

Length: 7073 bp 

Storage type: Basic 

Form: Circular 

Restriction Map 
BalI: 1 site 
BclI: 1 site 
DraIII: 1 site 
HindIII: 1 site 
KasI: 1 site 
KpnI: 1 site 
NarI: 1 site 
PmlI: 1 site 
PvuI: 1 site 
SacII: 1 site 
SalI: 1 site 
XbaI: 1 site 
XmnI: 1 site 
NdeI: 2 sites 
EcoRV: 3 sites 
SphI: 3 sites 
NcoI: 4 sites 
BamHI: 6 sites 

Functional Map 

CDS (4 signals) 

CMV IE 5' UT 
Start: 886 End: 1129 

CMV IE INT 
Start: 1130 End: 1840 

TbGH 
Start: 4090 End: 4642 

Kanr 
Start: 6138 End: 6760 (Complementary) 

Misc_feature (2 signals) 

CMV enhancer 
Start: 248 End: 885 

GP(S) 
Start: 1870 End: 4089 



Annotations 
1 scccaccrrr cocicatcxc ggt»a*xcc tctgacacxt ccagctcccc 

MCOCOCXAX GCCACTACTG CCACTTTTGO ACACTCTG2A CCtCCWWOC 

*«i * ejkcaecC^aL CASCITGTCT GTAXCCgQXT GCCCCCXOCA. CAOACCCCO 
* ^^^^ ^^J^ CXrtCGCCTX CGOCCCTCCT CTGTCCtXeC 

ioi * •Kyifi«a:flCG TCACCCOOTG TTCCCOGGM TCGGOGCTGG CTTAACTATG 
J^^S^ IScCCcSc JJkCCGCCCAC ACCCCCGACC CAATTGMAC 

Kdel 

freccuCXSX GCAQATTCTA CTeASACTGC ACCATATGCO ffTGTCAAATA 

S^E^ SSciaaSt OACTCTCACG tcctatacgc ca^^^^» 

* Ball 

reeexcxexT OCGTAAeCXG AAAATACCGC ATCACATTCG CTATTCQCCA 
^•'^ S?^Sa JsSttcSc TTTTATGGCG TACTCTAACC CATAACCOOT^ 

«r ttgcItacst tctatccata tcataatatc tacatttata '^ggctW;^ 

AACCIATGCA ACATAGGTAT AGTATTATAC ATG7AAATAT AACC«yUSTAC ^ 

* ' * * ,C-rI>ta.TTA CCGCCAT6TT CACATTGATT ATIGACTACT TA-ITAATAOT 

ii^S^T C^]^ CTGT>A=TAA TAACTC^^ 

»«r VATakATrAC*CGVGTCATTA GTTCA7ACCC CATATATGGA OTCCGOGTT 

• KACTTAATG CCCCACTAAT CAACTACCGG C7ATATACCT CAAGGCGCAA ^ 

Vrii-i»fA-rv CGGTAAATGC CCCGCCTGGC TGACCCCCCA ACCACCCCCG 
kSSJaS GiScCCACCG ACTGGCGGCT J<^^^^^ _ 

Ndel 

Kcii GCASTACAIC AXCTCTXTCA TATGCCAACT ACGCCCCCTA TTGSrCTCAA 
* VckciMTJ^ CCCATTATGC CCAGTACATC ACCTTATCCC 



Ucol 



jfCl HMTO^TAC TTGGCACTAC ATCTACCTM TWTTCATCGC TXTTACCJTG 



701 ^T^^TCCGGT TTTCGCACTX CXTCXXTGOC CGTGGATXCC ^^T^^ 



HCOX 

GTi^ACCC <^<^^'^^^ fft^f^!. 

' \^ J rrA'frV^ orazGTCWiC ACCCCATTGA CCSTCAATCCG AGTTTCTTTT 
JgSSiS SSS JSSSSaACT GCAC^ 
801 CGCACCAXIUl TCAACCCCAC TTTCCMAXT GTCGTAACXA CTCCGCCCCX 
CCGTGG77TT AGTTCOCCTQ MAGGTTT7A CAGCXTTCTT a\GGCCCGaT 



851 TTGXCCCAIU^ TCCCCGCXAG CCGTGTACGC TGGGAGCTCT ATA7AAGCAG 
AACTGCGTTT ACCCGCCXTC CGCACXTCCC ACCCCCCAOX TATAKCOTC 



501 AGCTCCTTTA CTGAACCCTC AGATCGCCTC GAGAC6CCAT CCACGCTCTT 
7CGAGCAAAT CACTTCCCAG TCTACCCGAC CTCTGCCGTA GCTCCGACAA 



SacXl 

551 TTGACCTCCA TAOAAOACAC CG3GACCCAT CCAGCCTCCC CGGCCCCGAA 
AACTGGAGGr ATCTTCTCTC GCCCTCCCTA GGTCCCACGC GCCCCCCCTT 



1001 CGGTCCATTG GAACOC6GAT CCCCCGTCCC AAGAGTCACC TAAGTACCCC 
GCCACGTAAC CTTGCGCCTA A6GG5CACCC TTCTCACTCC ATTCATCOCO 



SphI 

X051 CTATAGACTC TATAGGCACA CCCCTTTGCC TCCTATGCAT GCTATACTCT 
CATATCTGAC ATATCCGTGT GGCCAAACCG ACAATACGTA CGATATGACA 



1101 T^rrGGCTTC GGGCCTATAC ACCCCCCCTT CCTTATCCTA TAGGTCATOG 
AAAACCGAAC CCCGGATATG TGGCCCCGAA GGAATACGAT ATCCACTACC 



1151 7ATAGCTTAG CCTA7AGCTG 7GGG7TATTG ACCATTATTG ACCACTCCCC 
ATATCGAATC GGATATCCAC ACCCAA:?AAC TGCTAATAAC TGGTGAGGGG 



1201 7ATTCCTGAC CATACTTTCC ATTACTAATC CATAACArCG CTCTTTGCCA 
ATAACCACTC CTATGAAACG TAATGATTAG GTATTCTACC GACAAACGGT 



1251 e;U^CTATCrC TATTGGCTAT ATGCCAACAC TCTGTCCTTC AGAGACTCAC 
CTTGATACAG ATAACCGATA TACCCTTATG AGACAUtjAM uCTCTGACTG 



1301 ACCGACTCTG TATTTTTACA GGATGCGCTC CCATTTATTA TTTACAAATT 
TGCCTCACAC ATAAAAATGT CCCACCCCAG CCTAAATAAT AAATGTTTAA 



1351 CACATATACA ACAACCCCCT CCCCCCTGCC C6CACTTT7T ATTAAACATA 
GTGTATATGT T GV A'CCGGCA GCCOOCACGG CCCTCAAAAA 7AATTTGTAT 



1401 CCGTGGCATC TCCACCCGAA TCTCCGGTAC CTGTCCCGGA CATGGGCTCT 
CGCACCCTAG ACCTGCGCTt AGAGCCCATC CACAAGGCCT OTACCCCACA 



1451 TCTCCCGTAG CGGCGGAGCT TCCACATCCC ACCCCTGGTC CCATGCCTCC 
AGACGCCATC COCGQCTCCA AGGTCTAGGC TCGGGACCAG GGTACGGAOG 



1501 ACCCGCTCAT GGTCGCTCGG CAGCTCCTTG CTCCTAACAG TCGAGGCCAC 
TCGCCCAGTA CCAGCOAGCC CTCGAGCAAC GAGGATTCTC ACCTCCCOTC 



1551 ACWAGGCAC AGCACAATCC CCACCACCAC CACTCTGCCG C ACAAQ QOCG 
TGAAICCCTG TCCTCTTACO CCMOTGOTC GTCACACGGC GTGTTCCOOC 



1601 TGGCCGTAGQ CXATQTGTCT CAAAATGAOC C?GQAgATTO GGCTCGCACQ 
ACCGCCATCC CATACACACA C TTTT A CTCQ CACCTCTAAC CCCACCCTOC 



1651 CCTGACGCAO ATGGAACACT TAAGGCACCQ GCAOAAOAAO ATCCA0GCA6 
C6ACTCCG1C TACCTTCTGA ATTCC6TCGC CGTCTTCrTC TACGTCCCTC 
N'COl 



neoX 



Sail 

Stei: Bell EcORV 



^•c-. ZUrt_.iiiH_ TGCACTCACC ffTCCTCQACA CCTGTGATCX CXTATCCCCO 



1951 



rsssss sisis^ 



-nlTHLrTO. TCCXTOCCTT TOCeiCtMT OAKMCAGC 

l^"^ "^^^ 



SSS?^?i 

'si^c's^i ™= . 

SCOKV 

_r7"irZZ/.^»^« r^GXCAAAfiC GTT0GGGCT7 CAGATCTGGT 

• — CTATCJkAGCA G»GAATCGC CTGAAAAWC 

SSSSS iiSSSS StI^ CCTCXTXCCC CACT^ 
2S01 ArrrrccTGA ccgggduitc ocattcttga tattggctax accaaxcga.^ 

TAUAOGACT CCOCCXTTAG CCCAAOHACT ATAACCCATT TCGT7TCCTT 



2551 ACGTTCCT7C AATCACCCCC CATTCGAQAC CCAGCAAACT ACA»CAAAA 
TGCAAGCJUIG TTACTOOOaC CTAAGCZCTC CGTC6TTTGA TCTGACTTTT 



2601 TACATCAAGT TACTATGCCA CATCCTACTT GGACTAC6AA ATCGAAAATT 
ATCTACTTCA ATGATACGGT GTAGGATGAA CCTCATGCTT TAGCTTTTAA 



2651 TT GG T GC T C A ACACTCCACG ACCCT7TTCA AAATTAACAA 7AATACTTTT 
AACCACC&CT TCTGAGOTa: TGGGAAAA6T TT?AATTGT7 AT7ATGAAAA 



2701 C77CTTCTCG ACACCCCCCA CACGCCTCAC TTCCTT TT CC AGCTGAA76A 
CAAGAAGACC TCTCCGGGGT GTGCCGAGTC AAGGAAAAGG TCGACTTACT 



2751 rACCATTCAA CnCACCAAC AGT7GAGCAA CACAACTGGG AAACTAATTT 
ATCCTAAGTT GAACTCGTTG TCAACTCG7T GTGTTCa^CCC TTTGATTAAA 



2601 GGT'XACTAGA ?GCTAAYATC AATGCTOATA T7GGTGAA7G GGCTTTTTGG 
CCTCTGATCT ACGATTATAG TTACGACTAT AACCACTTAC CCGA&AAACC 



2851 GAAAATAAAA AAATCTCTCC GAAGAACTAC CTGGAGAACX GCTGTCCTTC 
CTTTTATTTT TTTACWSAGG CTTGTTGATC CACCTCTTCT CGACACyUUlG 



2901 GAAACTTTAT CGCTCAACGA GACAGAAGAC CATGATGCCA CATCG7CGAG 
CTTTCAAATA CCGAGTTGCT CTGTCTTCTG CTACTACGCT CTACCAGCTC 



2951 AACTACAAAG GGAAGAATCT CCGACCGCGC CACCAGGAAG TATTCGOACC 
TTGATCTTTC CCTTCTTACA GGCTGCCCCG G7CCTCCTTC ATAAGCCTGC 



3001 TGGTTCCAAA GGATTCCCCT CGGATGGTTT CATTGCACGT ACCACAAGGG 
XCCA^^GTTT CCTAAGGGGA CCCTACCAAA GTAACGTGCA TCCTCTTCCC 



3051 CAAACAXCAT TGCCCTCTCA CAATTC6ACA GAAGGTCGAA GAG7ACATGT 
CmCTTGTA ACCCCA6AGT CTTAAGCTGT CTTCCAGCTT CTCATCTACA 



aiOl GAA7ACTCA6 GAAACTATCA CAGAGACAAC TGCAACAATC A7ACGCACTA 
CTTATGAGTC CTTTGATACT GTCTCTGOTG ACCTrGTTAG I'ATCCCTCAT 



3151 ACGGTAACAA CATGCAGATC TCCACCATCG GGACAGGACT GAGCTCCAOC 
TGCCA7TGTT GTACQTC7AG AGGTCCTAGC CCTG7CCTCA CTCGAGCTCG 



NcoI



3201 CAAATCCTGA GTXCCTCACC CACCATCCCA CCAACCCCTG AGACTCAOAC 
GTTTAGGACT CAAGGACTGG CTCGTACCCT GGT7CGGCAC TCTCACTCTO 



3251 CTCCACAACC TACACACCAA AACTACCAGT GATCACCACC CAGGAACCAA 
GACCTCTTGG ATG7G7CGTT 77GATG6TCA CTACTGGTGG CTCCTTGGTT 



3301 CAACACCMCC GAGAAACTCT CCTOGCTCAA CAACACAAOC ACCCACTCTC 
GTCGTCCT06 CTCTTTGACA GGACCCAGTT CT J Cg C Tg C G TOGGTGAGXO 



3351 AOCACCCCAG ACAATA7AAC AACACCGGTT AAAAC7GT7T GGGCACAAQA 
TCCTGGGCTC TCTTATATTG TTCTCGCCAA TTTTGACAAA CCCCTGTTCT 
3401 CTCCACAACC AACGCTCTJUl TAACTTCXXC AOTAACACCT XTTCTTCCCX
CAC G T G TTCQ TTGCCACATT ATTCAXGtTO TCATTCTCCA 



3iSI CCCTTCGKCT 7COAAAXCGC AGCACAAOJIC AACTTXACAC CAGCCCCACO 
CGGAACCTGA ACCTTTTGCO TCCT CT TCT G WCAATTGTO CrCCCGGTGC 



3501 CGTAAATCCA ATCCCAACTT ACACTACTGG ACTCCACAAG AACAACATAA 
CCATTTACGT TACGQTTGAA TCTGATGACC TOACGWrTC TTGTTGTATT 



BamHI



3551 T GC T GC TCSO ATTGCCTGGA TCCCCTACTT TCCaCCCOCT CCAGAAGGCA 
ACGACGACCC TAXCCGACCT ACOGCATGAA ACCTGGCCCA CG7CTTCCCT 



3£01 TATACACTCA AGGCCTTATG CACAACCAAA ATCCCTTAGT CTCTGGACTC 
ATATGTGJICT TCCGGAATAC CXCTTGGTTT TACGOAATCA CACACCTGAfi 



3651 AGACAACTTG CAAATGAAAC AACTCAAGCT CTCCAGCTrT TCTTAAGGGC 
TCTGTTCAAC CrrrTACmTG MCyiGTTCCA CACGTCGAAA AGAArTCCCC 



17C1 CACGACCCAG CTGCCOACAT ATACCATACT CAA?AG<5AA5 CCCATAaKTT 
GTGCrCCCTC CXCGCCTGTA TA^GGTATCA GTTA7CCTTG CGGTATCTAA 



BamHI



3751 TCCTTCTGCC ACGATGGCCC GGGACATCTA G3ATCCTGGG ACCACAT75T 
ACCAAGACCC TGCTACCCCG CCCTCTACAT CCTAGGACCC TGGCCTAACA 



3801 MCATTGACC CACATCXTTG GACCAAAAAC ATCACTGATA AAATCAACCA 
ACGTAACTCG CTCTACTAAC CTCGTTTTTC TAGTGACTAT TTTAGTTCG7 



35^1 AatCAICCAT GA77TCATCG ACAACGCTTT ACCCAATCAC CATAATCATO 
TTASTACGTA CTAAAGTAGC TGr-TGOGAAA TGGGTTACTC CTATTACTAC 



3901 ATXATTGCTC CACGGCCTGG AGACASTGGA TCCCTGCAGC AATAGGCATT 
WkTTAACCAC CTGCCCGACC TCTCTCACCT AGGGACGTCC TTATCCCTAA 



3951 ACTGGAAT7A TTATTCCAAT CATTGCTCTT CTTTGCGTCT GCAAGCTCCT 
TGACCMAAT AAf AACGTTA CTAACGACAA GAAACCCAGA CGTTCGACQA 



BamHI



iOOl tWrWAATA TCAGAATTCC AGCACTCGCO GCCGTTACTA GTGGATCCGA 
AACAACTTAT AGTCTTAACG TCGTGACCGC CGGCAATQAT CACCTAOOCT 



UarX 

4051 CCTCGGATCC AAGCTCTACA CCAGCCCCCT GGATCCACAT CTGCTGTGCC 
CGASCCTAOG TTCCAQATCT CCTCCGCGGX CCTAGGTCTA GACGACACCG 



ClOl 1TCTACT7GC CACCCATCTG TTOTTTGCCC CTCCCCCCTC OCTTCCTTGA 
AACATCAACC GTCCGTACAC AACAAACGGG GACCCGCCAC GGAAGGAACT 



ilSl CCCTCCAAGG TCCCAC2CCC ACTGTCCTTT CCTAAffAAAA TfiXCGAAACT 
CCCACCTTCC ACGGTGACCG T6ACACCAAA GGATTATITT ACTCCTTTAA 
4201 GCATCCCXTr CTCTQACTXC OTCTCATTCT AWCTGGGGG CTGGGGTCCC
S^liS^CtCXlC CACAOTAACK TAAGACCCCC CACCCCACCC 



Spill 

CCACCACACC AXGCCC5GXGC ATPGGGAWyi CAATACCAGG CArCCTGGGG 
SSCCTGTCG TT^CCTCC TAACCCTTCT OTTATCGTCC CtACGACCCC 

4101 ATGCOOTGGO CTCTATCCGT ACCCACGTGC TCAAGAATTC ACCCGGTTOC 
' * * ^r«cfiGrcjL GAAAGAACCA GCCACATCCC CTTCTCTCTG ACACACCC7G 

^40^ TCCACGCCCC TGCTTCTTAG TTCCAGCCCC ACTCATACOA CACTCATAGC 
* iScTGCGGCO ACCAACAATC AAGGTCGCCG TGAGTATCCT GTGAKATCC 

"^IrftXMCe TCCGCCr»CA ATCCCACCCC CTAAAGTACT TGGAGCCGTC 
i!*^01 ' TCTCCCTCCX: TCATCAGCCC ACCAAACCAA ACCTACCCTC CAAOAGTOGG 
J «i ' VagxL^AA AGCAAGATAG GCTATTAAGT GCAGAGGGAC AGAAAATCCC 

^jffti TrrAACXTCT GA6GAACTAA TCAGAGAAAT CATAGAATTT CTTCCCCTrC 

GASCCACTGA CTGAGCGACC CGAGCCACCA ACCCGACGCC CCTCGCCATA 
1701 ' CAGCTC^^ AAACGCGGTA ATACGGTTAT CCACACAATC ACGGGATAAC 

iVsi * GCJ^GCAAAGA ACA7GTGACC AAAAGCCCAG CAAAAGGCCA CGAACCGTAA 

iiflfti * ^^Is^nglG TTCCTGGCGT TTTTCCAITAG GCTCCGCCCC CCTGACGACC 

^Vm * MCXCJ^^ ACTCAOACCT OGCGAAACCC GACI^GACTA 

iVorVJA^^^A^:^ aggJ:^^ ISSSSSS ssssss 

ATTCC7ATCG TCCGCAAAGG CCGACCTTCG AOGOACCACG CGAGACGM:A 
* rrT-evPAcCC GATACCTCTC CGCCTTTCTC CCTTCGGGAA 

SSSSS 

Roo^ * BPtaaBcaeT MCTCWltCC TCACCCTCTX OGTATCTCAG TTCG6T0TAC 
SSSSS SSJSJiS ACWOCOXax CCMACaUStC AX« 
5051 CrCGTXCCCT CCAXOCTGGG CtCTCTOCJlC CXJi C C C CCCa KC&GCCCGA
CAGCAASCGA GGTTOOACCC GACACACCTG CTTGGCCCQC AAGTCGGGCT 



5101 CCOC7GCOOC TTA3CCGGTA ACTXTCOTCT TGAGTCCAAC CCGOTAAGAC 
GGCGACGCG6 AXTAOGCCAT ACTCAGCT7G GGCCATTCTG 



5151 ACCftCTTATC CCCACTGGCA GCA6CCACTC GTAACAGSXT 7A0CAGACCC 
TOCTGJUUUU; CG6TGACCOT CGTCCC7QAC CATTCTCCTA ATCCTCtXUSC 



5201 ACGTASGXAG 6CGGTGCTAC AGAGTrCTTO AAGTC6TGGC CTAACTACGG 
TCCAZACATC CGCCACGATG TCTCAAGAAC TTCACCACCG QATTGATGCC 



52S1 CTACACCAGA AGGACAGTAT TTGGTAtCTG CGCTCTSCTG AACCCAGTTA 
GA7GTG»C7 7CC7GTCATA AACCATAOAC GCGAGACGAC TTCGGTCAAT 



5301 COTCCGAAA AAGAOTTGG? AGCTCTTGAT CCGGCAAACA AACCACCOCT 
GCAAGCCTT? TTCTCAACCA TCGAGAACTA GGCCGTTTG7 TTCGTGGCGA 



5351 CCTAGCGGTC GTTITTTTGT TTGCAACCAG CAGATTACGC CCAGAAAAAA 
CCATCGCCAC CAAAAAAACA AACUITCGTC CTCTAATGCG CCTCZ'I I"XTT 



5401 ACSATGTCAA GAA6ATCCTT TGA^CTTTTC TACGGGGTCT CACGCTCACT 
TCCTAGACTT CTTCTAGGAA ACTAGAAAAC ATGCCCCAGA CTGCGACTCA 



5451 GGAACGAAAX CTCACGTTAA GGGATTTTGG TCATGAGAT7 ATCAAAAAGG 
CCTTGCTTTP GAGTCCAATT CCCTAAAACC AGTACTCTAA TAGTTTTrCC 



5S01 ATCTTCACCT AGATCCTTTT AAACTAAAAA TCAAGTTTTA AATCAATCTA 
TAGAACTGGA TCTAGGAAAA TTTAATTTTT ACTTCAAAAT TTAGTCACAT 



5S51 AAGTATATAT GACTAAACTC CGTCTGACAC TTACCAATCC TTAATCACTG 
TT»TATA2A CTCATTTGAA CCAGACTCTC AATGCTTACG AATTAGTCAC 



5601 AGGCACCTAT CTCACCGATC TGTCTATTTC GTTCATCCAT AGTTCCCTGA 
TCCCTCGATA CAGTCGCTAC ACAGATAAAG CAACTAGGTA TCAACGGACT 



5551 CTCCCGGGGG CGGGGGCCCT GAGG7CTCCC TCGTCAAa^A GCTGTTGCTG 
CAGGCCCCCC CCCCCCGCGA CTCCAGACGG ACCACTTCTT CCACAACCAC 



5701 ACTC\TACCA 6GCCTCAATC GCCCCATCAT CCAGCCAGAA AGTGAGGGAG 
TGAG7ATGGT CCCGACTTAG CGGGGTAGTA GGTCGGTCCT TCACTCCCTC 



5751 CCACMTTGA TGACAGCTTT CTTGTAGCTG GACCAGTTGG TGATTTTGAA 
GGTOCCAACT ACTCTCGAAA CA&CA7CCAC CTGGTCAACC AC7AAAACTT 



5801 CVa ' ^ T GC ' I ' IT GOCACGGM^ OGTCTCCOTT GTCCGGAAGA TCCCTGATCX 
GAAAAOCXAA CGCTGOCTTG CCAGACOCAA CAOCCCTTCT ACGCACTAGA 



5951 GATCCTTCAA CTCAOCAAAA GTTCGATMA «CAACAAAO CCCCCCTCOC 
CTAGGAAGTT GAGTCCTTTT CAAGC7AAAT AAGTTGTTrC GGCGOCAGGG 



5901 GTCAAGTCAG COTAA7GCTC TGCCACTGCT ACAACCAATT AACCAATTCT 
CAGTTCAGSC GCAtTACGAG ACGGTCACAA TGTTGGTTAA TTGGTTAAGA 



5951 QATTACAAAA ACSCAXCGAG CATCAAATGA AACTGCAATT TATTCA1ATC 
CTAATCTTTT TCAGTACCTC GTACTTTACT TTCACGTTAA ATAAGTATAC 
---.,1— XTXCCXTATT TTTOJAAAAS CCOTTTCMT AATGAAGOAfi

tosx''i^i^>^<miyii^ 2i?2£S^ SiiSiISS? 

TTTTCWnCC CTCCCTCJUkG CTATCCTACC OTTCTACG*C CATACCC*1» 

all ' Kttwi^W^TCCMC i^lCIATACAA CCIATTAATT ^CCCCTCCTC 
CeCTAXGGC? CAaCACCTTQ SRCTOITGTS CCXTAATTAX AGCCACOQ 

/. c, * VxAAXTXaCG nXTCMCra ASMUITCJLCC XTCftCTOACC actgaxtccq 
}5J5SSkC SniSlSc ICmXCTTGO TACTCACT^ 



CaOl GTtaOAWKJ CXAAAQOTX TCCAITICTT 



HindIII

SvCTCTTAM OTTTTcSjS icCTiixGAA AGGTCTCXAC AXSTTCtCCQ 

Msi * CAGCCmAC CCTCGTCA"=C AAAATCACTC GCATCM^CX AJJCCOTTATT 
StOTAATC COMCWWAS TTTTAfiTSAG CCTAGTTeCT TTGGCXXTXX 

PvuI

«*Bi exTreCT(UVT 5CCGCCTG*C CCASACOUm TACCCSXTCG CTCTTAAAA6 

SiSaScrl acgcgSctc cctctocttt atscgctaoc cacaaiktc 

" RxrJLXTTACa. MCAGGAXTC GAATCCAACC GGCCCAGOMl CXCTOCCACC 
SSSSct ttctwccag CTTACGTTGC CCOCGTCCTP GJGACCCTCG 

Kial ' a^'XSMZKk TATTTTCACC TGWVTCAGGA TX-TTCTTCTA ATXCCT5CAX 

SI^JSt ItaajSItgs acwagtcct ataxgaagat ^^-tcgacck ^ 
cVsi* VcCTcmicVccG^GAT^ cxG-aGraxc T-vrciToeA tcxKAooKa 

^ScWAAG CGCCCCTAOC GTCACCACTC XCTCCTXCGT ACTAGTCCTC^ 

c'^fli * *Tv:ecj^xxx xTCcrrcxTC ctcggaacao gcxtaaxttc cgtcacccxg 

tISgAActS CAGCCTTCTC CGTAT^ 

* I* ' *J1tA«CTQX CCATCTCATC TGTXACATCA T7G0CAACOC TACCTTTGCC 

SSaSotaS XCATTCTXGT XXCCCTTCCO ATCCXXMMG^ 

ClaI

fifiOl AlCWtCACA, AXCXACTCTG GCGCATCGGG CWCCCATAC **JCGATJ*X 
raCAMCTCT TTCTTGXGAC CCCOTAOCCC a\AGGGIXTG "^^^ 

««i ' '•^weetice tcxttgcccc xcxttxtccc oxccccattt atacccatat 

SSSSSS ISSSSS TOTAATA^^ 

Mfli ' V^ATCJlCCAi ccxTGrrccA xtttaxtccc ggcctcoaoc aacaoqtrc 

* SSSS^ oSxcS^ TAAXTTAOCO CC^^ 

* Vrw^ilixTl -^XJOTCXTXX CACCCCTXGT XTTACTSTTT AtCTAXOCAC 
ACWX^» CToScSxACA CAXTOAC^ 

/•ft^* *M!lmwxi «srT^ CATXTATTTT TATCTTGXCC AATOTAACAT 
;SSS[S! SSctA CTATATAAAA ATAOAACACO TTACATTO^ 
6851 CKSXGXrm GAGACXCMC CTCQCTTTCC CCCCCC CCCC ATTAWGAAC

CTCTCTJUUm CTCTOtcrrC cacccjuuusc occcgogcct, TAATAXcrrc 



€901 CATCTCCAC COTTATTOTC TCASGAGCGC A2ACATATTT CAATC7ATTT 
CTAAAIASTC CCAATAACAC AOTACTCGCC TATCTATAAA CT2ACATAAA 



€951 ACAAA&ATAA ACAAATAGGG CTT CC CCCCA CATTTCCCCG AAAAGTGCCA 
TCTTTTTATT TGTTTATCCC CAAOGCOCGT CTAAAOGGGC TTTTCACCCT 



7001 CCTCA CO TCT AAGAAACCAT TATTASXaiTC ACATTAACCT AT AAAAA TAG 
C*I7^mjr?^^ TTCTTTOOTA A7AATACTAC TCTAATTGGA TATTTTTATC 



70S1 CCG7A3CACO AGCC CCT TT C CTC 
CGCATAG7GC TCC6GGAAAO CAC 
SEQUENCE LISTING ID NO: 3

General Description

Name: pVR 1012-GP(Z)
Local object

Created: 09/15/98 OSsOSM
Last Modification Date: (no data)
Length: 738S bp
Storage type: Basic
Form: Circular
Comments



CCGGCCCC 
CGCCCCC6 




GGCG&C 
TCTAGX 

CTCG7vG 
CTCGCTC 

GAIATC 
CTATAG 

CP XTCC 
GG73GCC 

CATATC 
GTXTXC 

GCKTGC 



Functional Map

CDS (4 signals)

CMV IE 5' UT

Start: 889 End: 1129
CMV IE INT

Start: 1130 End: 1640
BGH

Start: 4302 End: 4854

Kanr

Start: 6350 End: 6972 (Complementary)



WO 99/37331

PCT/US99/01382



Misc. feature (2 signals)
CMV enhancer

Start: 348 End: 885
GP(Z)

Start: 1870 End: 4301
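
The functional map gives each feature as a start and end position on the circular plasmid, with the kanamycin resistance gene marked as lying on the complementary strand. The following is a minimal sketch, not part of the original listing, of how such coordinates could be applied to a sequence string, assuming they are 1-based and inclusive. The `Feature` class, the helper names and the toy sequence are hypothetical; only the GP(Z) and Kanr coordinates are taken from the map.

```python
from dataclasses import dataclass

# Watson-Crick complement table for unambiguous bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


@dataclass(frozen=True)
class Feature:
    """A mapped feature; coordinates are assumed 1-based and inclusive."""
    name: str
    start: int
    end: int
    complementary: bool = False

    @property
    def length(self) -> int:
        # Inclusive coordinates, so both end points are counted.
        return self.end - self.start + 1


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract(plasmid: str, feature: Feature) -> str:
    """Slice a feature out of the top strand, flipping it if complementary."""
    if not 1 <= feature.start <= feature.end <= len(plasmid):
        raise ValueError(f"{feature.name}: coordinates fall outside the sequence")
    segment = plasmid[feature.start - 1:feature.end]
    return reverse_complement(segment) if feature.complementary else segment


# Coordinates as read from the functional map for pVR 1012-GP(Z).
GP_Z = Feature("GP(Z)", 1870, 4301)
KAN_R = Feature("Kanr", 6350, 6972, complementary=True)

if __name__ == "__main__":
    toy = "AAACGTTT"  # hypothetical 8-base sequence, for illustration only
    print(GP_Z.length, KAN_R.length)
    print(extract(toy, Feature("toy_forward", 4, 6)))
    print(extract(toy, Feature("toy_reverse", 4, 6, complementary=True)))
```

Applied to a clean copy of the plasmid sequence, `extract(plasmid, KAN_R)` would return the resistance gene in its own reading direction rather than as it appears on the top strand.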

Annotations 
1 TC CCCC CTTT CCOTOATGAC GGTGMAACC TCT O CACKT GCACCTCCCO
MeaCGCh^h GCCACTACTQ CCAC7TTTGG XaACT O TGTA COTCOAGCSC 



51 GACACOGTCX CAOCTMTCT OTXXOCGOAT CCC C CGXCCX GACA A CC CC Q 
C lCg GCC X C T GTCGJLXCAOA CATTCGCCTX COGCCCSCGT C7C7TCGGQC 



101 rCMCGGCGCG TCMCCCCTG TTGGCGGGTG TCOGGGCTGG CTTAACTXTG 
MICCCGCOC AG7C00CCAC AXCCGCCCAC AGCCCCGACC GAXTTGATAC 



051 CGOCMC&GX GCXCAT?GTA CTCXaXGTGC XCOITATGCC GTGTGAXXTA 
C0CG3MTC7 CCTCTAACAT GACTCTCACG TGG7ATACGC CACACTTTAT 



301 OCQCACACAT GCGTAAaOAO XAAATACCGC ATCAOATTGO C7ATTCGCCA 
GGCGTCTCTA COCATXCCTC TTTTATCCCQ TACTCTAACC CXTAACCCCX 



251 TTCCATACCT TGTASGCA7A TCATAATATC TACATTTATA TTCCCTCATG 
AACGTATCXIA ACATAGGTAT AG7A7TATAC ATGTAAATAT AACCGAGTAC 



301 TC CAACATTA CCGCCA TCTT CICATTGATT ATTGACTAGT TATTAATACT 
AGGTTCTAAT GCCGGTACAA CTGTAACTAA yAACTGATCA ATAATTATGA 



351 AATCAATTAC GCCCTCATTA GTTCATAGCC CATA7ATGGA CTTCCCCCTT 
TTA0TTAATG CCCCAGTAAT CAACTA7CGG C7ATATACCT CAAGGCGCAA 



401 ACATAACTTA CCGTAAATCG CCCGCCTGGC 7GACCGCCCA ACGACCCOCG 
7G7A7TGAA7 CCCATTTACC GGGCGGACCC ACTGGCCCGT TCCTCCGGGC 



451 CCCATTCACG TCAATAATCA CGTATCTTCC CATAGTAACG CCAATAGGGA 
G5GTAACTGC AGTTATTACT GCATACAACG OTATCATMC GCTTATCCCT 



501 C7T7CCATTC ACGTCAATGG CTCGAGTATT TACGCTAAAC 7GCCCACTTC 
CAA&GCTAAC TCCAGTTACC CACCTCATAA ATCCCATTTG AC6GGTCAAC 



NdeI



551 CCAGTACATC AACTGTATCA TATCCCAAGT ACGCCCCCTA T7CACGTCAA 
CG7CAT0TAC TTCACATAGT ATACGCTTCA TGCGCCGGAT AACTGCACTT 



601 TGACCGTAAA TGGCCCCCCT 6CCA77ATCC CCAGTACATG ACCTTATGGC 
ACTCCCATTC ACCCCCCGGA CCGTAATACG GGTCATG7AC TGGAATACCC 



NcoI

651 ACTTTCCTAC TTGCCAGTAC ATCTACC7AT TACTCA7CGC TATTAOCATC 
TGAAAGGATG AACCGTCATO TAGATGCATA AXCAQTAGCG ATAATCGTAC 



NcoI

701 CTGATCCCGT TTSCGCAOTA CATCAATGGG CGCCGATACC CGnTGACTC 
CACX»OGCCA AAAOCOTCAS GTACnACOC GCACRATCO CCAAACTQAG 



751 ACGCCCJCTT CCAACTCTCC ACCOCATTGA COTCAATCGO AGTWGTTTT 

Tfft rrr^A ftft cgttcagagg tcgggtaact gcactxaocc tcaaacaaaa 



801 COCACCAAAA TCAACGGQAC TTTCCAAAAT GTCGSAACAA CTCCGCCCCA 
CCCTCGTTTT ACT7GCCCTC AAAGG7TTTA CAGCAT7GT7 GAGGCGGGCT 
851 ttgacgcxxx togcccotao ccctotacoc tccgmwtct atat aagcxc
juu:mcottt acccoccmc cocacatccc accctccaoa tatattccstc 



901 ACCTCCTTCA CKAXCCGXC AGATCGCCTO GMACCCCAT CCACCCTGTT 
TC»CCAAAT CXCTTCCCAG TCTACCCGJUC CXCTCCOGTA CCTCCGACAA 



SacII

fSl TTOACCTCCA TAQAAfiACAC CCCGACCCAT OCAGCCTCCO CGCCCCCGAA 
AACTGGA5GT AT C TTCT G TO CCCCTOGCTA GGTCCCAfiOC GCCCCCCCTT 



1001 C CG T GC ATTG CAACCCOGAT CCCCCGTCCC AACAGTCACC TAACTACCGC 
GCCACCTAAC CTT GCG CCTA AGGGGCACCG TTCTCACTGC ATTCATGGCC 



SphI

1051 CXATAGACTC TATAGOCACA CCCCTTTCCC TCTTATGCAT CCTATACTGT 
CA2A3CTCAC ATATCCCTGT GCCCAAACCO AliAATACCTA CQATATCACA 



1101 TTtTCCCTTG CCGCCTATAC ACCCCCCCTT CCTTATGCTA TACGTGATCC 
AAAACCGAAC CCCGGATATG TCCCCGCGAA CGAATACGAT ATCCACTACC 



1151 TATACCTTAC CCTATACGTQ TOGGTOAMG ACCATTATTG ACCAC7CCCC 
ATATCGXATC GGATATCCAC ACCCAATAAC TGGTAATAAC TGGTGACCGG 



•201 TATTGGTCyiC GATACTTTCC A^TACTAACC CATAACATGG CTCTTTGC^ 
ATAXCCACTG CTATGAAACC TAA7CATTAG CTATXGTACC CAGAAACGCT 



1251 CAACTATCTC TATTCGCTAT ATGCCAATAC TCTGTCCTTC ACAOACTGAC 
CTTGATACAG ATAACCGATA TACGGTTATG AGACAGGAAG TCTC7GACTC 




150- ACGGAC7C7C i i tta^cA GGATGGGGTC CCATTXATTA TTTACAAATT 
•TGCCTGACAC ATAAAAATCT CCTACCCCAG GG7AAATAAT aAawiaxAA 



1351 CACATATACA ACAACGCCC7 CCCCCCTCCC CGCAGTTTTT ACTAAACM 
CTCTATATGT TGTTGCCCCA GGGGGCACGC CCGTCAAAAA TAATTTGTAT 



1401 CCCTGGGATC TCCACCCGAA TCTCGGCTAC CTGTTCCGGA CATGGGCTCT 
SScCCTAG AGGTOCGCTT AGACCCCATC CACAAGGCCT GTACCCCAGA 



1451 rCTCCCCTAG CCOCGGAOCT TCCACATCCG ACCCCTGCTC CCATCCCTCC 

ISagccStc cocgcctcca acgtgtaggc tcgggaccag ggtacogagg 



1501 agcccctcat cctcgctccg cagctccttg ctc«ja»c tcga^cao 

TCCCC6ACTA CCACCCAGCC CTCGAGOAAC GAOGATTCTC ACCTCCGGTC 



1551 ACTTAGGCAC AGCACAATCC CCACCA C C A C CACTOTGCCC CA CAAG CCCG 
TCAMCCOTO TCCTCTTACG CCTGOTCGIG OTCACACCCC CTCMCCOGC 



IfiOl TGGCCOTACG GTA3CTGTCT GAAAAT CAQC GTOCAC^ ^SE^?^ 
ACCCCCATCC CATACACAGA CTTTTACTCO CACCTCTAAC CC6AGCGTGC 



1€51 CCTGADGCAC ATGCAACAM TAACCCAGCG CC^SAMAA^ tl^S^^ 
CGACTCCCTC TACCTTCTGA ATTCCGTCCC CGTCTTCPTC TACCTCCOTC 



1701 CTOACrrCCT GTATTCTGAT AACAGTCACA OGTAACTCCC GTTGC6QMC 

SctouSa Staagacta ttctcactct ccaticagoo caacgccacc 
HpaI



1751 


TGTTAXCGGT 
XCAATTGCCA 


CCXCCGCACT CTAOTCTGAG CAGTACTCGT TOCTGCCOCG 
CCTCCCOTCX CX7CAGACTC CTCATGAGCA ACCXCGGCCC 








NcoI


1801 


CGCGCCACCA 
CCGCGGTGCT 


GXCATAATAG 
C?GTA3TA9C 


CTGACAGACT AACACACTGT TCCTTTCCAT 
GACTGTCTCA TTGTCTCACA AG3AAAGCTA 




NcoI




BclI EcoRV NotI


1851 




TGCACTCACC 
ACCTCAGTGO 


trrCCTCGACA CCTGTGATCA CA7ATCGCGC 
CAGCAGCTQT CCACACTACT CTATACCCOC 



NarI



NotI XbaI KasI



1901 CCGCTCTAGA CCAOGCGCCT GGATCGATCC GCGATGAACA TTAACCCGAC 
GCCGAGATCT CGTCCCCCCA CCTACCTAGG CGCTACTTCT AATTCGGCTG 



1951 AG7CAGCGTA ATCTTCATCT CTCtT»£ATT ATTTCOT^TG GAGAGTAGG6 
ICACTCGCAT TAGAACfAGA a\CAATCTAA TAAACAAAAG GTCTCATCCC 



20Q1 CTCGTCAGGT CCTTTTCAAT CGTGXAACCA AAATAAACTC CACTAGAAGG 
CAGCACTCCA CCAAAAGWA GCACATTCCT TT7ATTTGAG CTGATCTTCC 



2051 ATATTGTGGG CCAACAACAC AATGCCCCTT ACACGAATAT TCCAGTrACC 
TATAACACCC C677GTTGTG TTACCCGCAA TCTCCTTATA ACGTCAATCG 



2101 TCCTGA7CCA TTCAACAGGA CATCATTCTT TCTTTGGCTA ATTATCCTTT 
aGCACvAGCT AASTTCTCCT GTAGT^ACAA AGMACCCAT TAATAGGAAA 



2151 TCCAAACAAC ATTTTCCATC CCACTTCGAG TCAXCCACAA TACCACATTA 
AGCTTTCT7C TA/AAGGTAG GG7CAACCTC ACTAGGTCTT ATCGTGTAA7 



2201 CACCTCAGTC ATGTCCACAA ACTA Q TTTGT CCWACAAAC TGTCATCCAC 
CTCCAATCAC TACACCTGTT TGATCAAACA GCACTGTTTG ACACTAGGTG 



2251 AAATCAATTG ACATCAGTTC GACTGAATCT CGAACCGAAT GCACTGGCAA 
TTTACTTAAC TCTAGTCAAC CTGACTTAGA CCTTCCCTTA CCTCACCCTT 



2301 C76ACCTGCC ATCTGCAACT AAAACATCG5 CCTTCAGCTC CGGTCTCCCA 
CACTGCACCG TAGACGTTGA TT77CTACCC CGAACTOCAG CCCACAGGCT 



2351 CCAAAGCTCC TCAATTATGA ACCTGGTGAA TCGGCTGAAA ACTCCTACAA 
GGmCCACC AOTTAATACT TOOACCACTT ACCCOACTTT TOACGATCTT 



2401 I C TT Q AAATC AAAAAXCCTG ACCGCA0IGA CTCTCtACCA CCAOCGCCAfi 
AGAACmAO TTTTTTGGAC TCOOCTCACT CACAfiASCCT CGTCGCGQTC 



2«S1 ACGGCASTCG CGGCXTCCCC CGGTGCCCGT ATGTGCACAA ACTATCAOGA 
TCCCCTAACC CCCGAACGGG GCCACGGCCA 7ACACGTGT7 KATAGXCCT 



2501 ACGGGACCGT GTGCCCQAGA CTTTGCCTTC CAT AAAGAC C OTGCTTTCTT 
TGCCCTGGCA CACGGCCT C T GAAACGGAAO GXATTTCTCC CACGAAAGAA 
2551 eCTCTATCXT CCaCTTCCTT CCACXOTIAT «ACCGAGGX ACGACTTTCC
CGWCATACTX GCTCJULCOAA CCTOTCAM A CATOOCTCCT TGCTGAMGC 



2B0X CTCAAOGTCT CGTTCCXTTT CTGATACTGC CCCAAOCTAX GAAGGACTTC 
CACTTCCACJl GCAACCTAAA CACTATSftCC CGGTTCCATT CTTCCTGAXC 



2651 TPCAGCTCAC ACCCCTTGAG A6AC0CGG7C XATOCAACCC AGGACCCC7C 
AAGJCCXCTG TCCCGAACTC TCTCGGCCXO OTACCTTGCC TCCTCCGCAO 



EcoRV

2701 TACTGGCTAC TATTCTXCCX CAATUGATA TCAGCCTACC GGTWTCGAA 
ATCACCGATC ATAACATGGT CTTAATCTAT AGTCC6ATCG CCAAAACCTT 



2751 CCAATGACAC AOAGTACTTQ TTCCAGCTTC ACAAITTCAC CTACCTCCAA 
GSTTACreTG TCTCATGAAC AAGCTCCAAC TGTTAAACTG CATCCAGGTT 



2801 CTTGAAffCAA GATTCACACC ACAGTTTCTC CTCCACCTGA ATGACACAAT 
GAACTTAGTT CTAAGTCrCC TGTCAAAOAC GACCTCGACT TACTCTGTTA 



2851 ATATACAACT CGGAAAACC A GCAATACCAC CCC AAAA CTA ATTTG GAACC 
TATATGTTCA CCCmTCCT CCTTATGGTC CCCTTITCAT ^AAACCTTCC 



2901 TCAACCCCGA AATTCATACA ACAATCGGGO ACTGGGCCTT CTCGGAAACT 
ACTTGGCCCT TTAACTATGT TCTTAGCCCC TCACCCCGAA GACCCTTTGA 



2951 AAAAAAAACC TCACTAGAAA AATTCGCAGT CAAGACTTCT CTTTCACAGT 

TrrrTrTXGO actgatcttt ttaagcctca cttctcaaca gaaastgtca 



3001 TCTA3CAAAC GGACCCAAAA ACATCACTCG TCAGACTCCG GCCCOAACTT 
ACATASTTTG CCTCGGTITT TGTAGTCACC ACTCTCAGGC COCQCrTGAA 



3051 CTTCCCACCC ACGGACCAAC ACAACAACTG AACACCACAA AATCATGGCT 
GAAGGCTGCG TCCCTCCTTG T G TT G TTGAC TTCTGGTCTT TTACTACCGA 



J 101 TCAGAAAATT CCTCrCCAAT GCTTCAA6TG CACAGT CAAG GAACGCAAGC 
ACTCrrrTAA GGACACGTTA CCAAGTTCAC GTG7CAC7TC CTTCCCTXC 



3151 TOCAGTGTCG CATCTAACAA CCCTTGCCAC AATCTCCACG ACTCCCCAAT 
ACGTCACAGC GTACATTCTT GGGAACCGTG TXAGAGGTGC TCAGGGGTTA 



3201 CCCTCACAAC CAAACCAGCT CCG GACAAC A CCACCCATJA ^Aa^CCCGTO 
GGCA£?XCrTC GTTTCCTCCA CCCCTGTTGT CGTGGCTATT ATCTGGGCAC 



3251 TATAAACTM ACATCTCTGA CCCAACTCAA OTTGAACAJ^ A«*fCOCAC 
ATAXKGAAC TCTAGAGXCT CCGTrCAOTT CAACWGTTC TACTGCCCTC 



2301 AACAfiACAAC GACAGCACAO OCTCCOACAC TCCCTCTOCC ACGACCCCAO 
WCTCTCTTO C l'UlWiGr C CGAGGCXCTO AOCCACACGG TCCTCGCCTC 



33SI CCSSHCrCCC AAAAOCACAC ^^^^^^^^^^ Cgjjgg^j^ SSTSiSIS 
GGCCTOOOGG ' i ' lU ' lCU r C T C TTCTCGTTGT CCTCGTTCTC GTGACTGAAa 



3401 CTGGACCCCC CCACCACAAC AAGTCCCCAA AACCACAOCG ASACCGCTCQ 
GACCTCGCGC GGTCCTGTTG TTCAOGGCTT TOOGTCTCCC TCTCG C GA CC 
3451 CAACAACAAC ACTCATCACC AAOATACCGG AGAACACAGT GCCACCAGCO
GTTGTTOnO TOACTAOTOO TTCTASOGCC TCTTCTCTCX CCGI C GT C CC 



3501 GGAAOCXAGO CTTAATTACC AATACTATTC CTGOAOTCGC ACGAC7GATC 
CCTT CGAT CC CAATTAATCO TTA7GATAAC OACCTCACCO TCCTCACTAO 



3551 ACAflGCCCCA GAA6AACTCG AAQAOAAGCA ATTGTC^TG CTCAA CC C A A 
M?CCCCCCT CTTCTTGAOC IVC T CI'IC G I TAACAOTTAC CXCTrGGCTr 



3(01 ATCCAACCCX AATTTACATT ACTGOXCTAC TCACGATOAA GOTCCTGCAA 
nCGRGSCX TStMXGfCAA TGACCTGXTQ ACTCCTACTT CCACGACGTT 



3651 tCOCACJCCC CTQGATAOCA TATX7CGGGC CAGCAfiCCQA 6CGAATTTAC 
ACCC3QACCG GACCTATGOT ATAXAGCCCG CTCGTCGGCT CCCTTAAATC 



3701 ASASAGGCCC TAATCCACAA TCAAGATOGT 7TAATCTCTC GGTTGAOACA 
T A ' J CT CCCCC AWACOTGTT AOTTCTACCA AATTACACAC CCAACTCTGT 



3751 GCTGGCCA7£ GACACGACTC AAGCTCTTCA ACTGTTCCTG AGACCCACAA 
Ca»XCSOrTG CTCTGCTGAC •TTCGACAAGT TGACAAGGAC 5CTCGCTCTT 



3801 CSGASCTACO CACeKrTCA XTGCTCAACC GTAAGGCAAT CGAWCTTa 
GACTCCATGC GTCCAAAACT TAGGAGTTCS CATCCCCTTA ACTAAAGAAC 



3851 CTGCAOCGAT GCCGCGGCAC ATGOCACXTT CTGGCACCGG ACTCCTGTAT 
GACCTC GC TA CCCCGCCCTG rACCCTCTAA CACCCTGGCC TCACGACATA 



3901 CCAACCaiCAT GATTCGACCA AGAACATAAC AGACAAAATT GATCAGATTA 
GCrrCCTGTA CTAACCTGGT TCTTGTATTG TCTCTTTTAA CTACTCTAAT 



3J51 TTCATCAI'TT TGTTGATAAA ACCCTTCCCG ACCACGCGGA CAA7GACAAT 
AAST?^AA* ACAXC^ATTT TGGCAAGGCC TGGT C C CCC T GWACTGTTA 



4001 TGG7CGACAG CATGGAGACA ATGGA7ACCG GCAGGTATTG GAGTTACAGG 
ACCACCTCTC CTACCTCTGT TACCTA7GGC CCTCCATAAC CTCAATCTCC 



4051 CGTTATWlTr GCACTTATCG CTTTATTCTG CATATGCAAA TTTGTCTTXT 
GCAATATTAA CG7CAA7ACC GAAXTAAGAC A:7ATACGTTT AAACAGAAAA 



4101 ACXmrCTT CAGATTCCTT CATCCAAAAG CTCAGCCTCX AATCAATCAA 
TCAAAAACAA GTCTAACGAA CTACCTTTTC GAGTCG6AGT TTAGTTACTT 



«151 ACCAGQAtTT AATTATATCG ATTACTTGAX TCCAAGATTA CTTGACAAAT 
TCOICCSAAA TTAATATACC TAASOAACTT AGAKCTAAT GAACTCTTTA 



4301 CATAATATAA TACXCTOGAO CTTTAAACM AGCCAATGTG A7TCTAACTC 
CTATTATATT ATOTGACCtC OAAATWCTA TCCCTTACAC TAAGATTOAO 



4251 CTTTARACtC ACAOTTAATC ATAAACAAOO TTTOOTACCG ACCTCOAAIT 
GAAAmOAO TG7CAATTAG TATTTG7TCC AAAOCATGGC TCGACCTTAA 



4301 A ICTCCT 6 7G CCTTCTAOTT GCCAGCCATC TOTXCTrTGC OCCT CC CX3 C O 
TAOACGACAC GGAAGATCAA CGGTOGOTAG ACAAC AA ACG GGGAGGGGGC 



4351 9CCCTTCCTT GAOCCTCQAA 6CTGCCACSC C CACT OTCCT TTCCTAATAA 
ACOGAACGAA CTGGGACCTT CCACGGTGAC GGTGACAGGA AAGGATTATT 
4401 MOCXaOMiK TrOCATCOCJl TPCXCTGWIT j^CCTCTCXTT C^ATTCTCCC
TTACTCCTTT JUiCCTAOCCX XXCAflACTCX TCCACACTAX GlkTAACACCC 

4«1 CCCTOOGCTO CGGCXCC A CA> GCAXOQOOOJW COftTTCCCXX GAOUlTjfcCC^ 
CCC3iCCCCAC CCCCTCGTCX CGTTCCCCCT CCTAATCCTT CTCTTXTCCT 

ISfil CCCA2CttC0 GGMCCGOTO COCTCTATCC GTACCCACgT GCTCAXCAXT 
ttcSSftCC CCTACOCCXC CCGAGATACC C&TCCCTCCA CCACTTCTTA 

4551 WaCCCCOTT CCTCCTOCCC CXGAAACAXa CAC C CACjfcTC CCCTTCTCTG 
jyCTOCGCOUi CGAGGACCCC Ultl ' mCi ' l^^^ GTCCCTOTAO CCCAAGAOAC 

460r WACXOLCCC TCTCCACCCC CCTGCTtCTT JUSTTCCAOCC COJjCTOlTAO 
ACtCTGTGGC XCAGGTCCCO QGACCAAfiAX TCXXCGTCGC CCTGXGTATC 

4»r Q^ci^^ GCTCAGCXCC GCTCCCCCTT CAX^CCOJ^ CCKAAJ^ 
CrGTG?WS7XT CGAGTCCTCC CCJICOCGOAA CTTAGGGTGC CCGATTTCXT 

47er'cWGGXCCM ICTCTCCCTC CCCCATCAOC CCXC CAXAC C JJJJ^ 

(aju:crrcocc AGxaxfiGoiiG cgactactcc cGTCGrrrcG totggxtcgo 

4751* TCCJUUyiGTG AAXOCAXCXT ATWCTATTAA CTGCXCJl^ 

JUOTTCTCAC CCTTCTXTXA TTTCCTTCTA TCCCATAATT CXCCTCTCCC 

IftDl XGAGAAAAT6 CCTCCAXCAT CTGAGGAXCT AXTGACACAA ATCX7AGAAT 
TCTCKKAC CcIgGTTCTA CACTCCTTCA TTACTCTCTT TACTATCTTA 

WCTTCCGCT TCCTCC3CTCA CTGACTCCCT GCGCTCGCTC CrXCGGCTOC 

aScgcga aggagccagt gactgagcga cccgagccag caagcccacg 

«01* GCCCAGCGGT ATCAGCTCAC TCAAACGCGG TAATACGGTT ATCCACACAA 

£9^1* KAGGGM ACGCAfiOAAA CAACATCTCA CCAAAACCCC AC CAA&AG GC 
IS^^AT TCCOTCCTW CroCTACACT CCTCTTCCCC J^<^^^^jyj^^ ^ 

iftfll CASGAXCCGT AAAAAGGCCC CGTTGCTCGC GnTTTCCAT ACGCTCCCCC 
S?^5tcgS tttttcccSc GCAACGACCG CAAAAAGOTA '^<^^^_ 

SOSl CCCCT6ACGA GCATCACAAA AATCGACOCT CAAGTCACAC GTGCCCAAAC 

/lOl ' ccSACACGAC TATAAAOATA CCAGGCCTTT CCCCCTOGAA CCTCCCTCOT 

5*151 ' CeeCTCTCCT GTTCCGAOCC TOCCGCTTAC CGGAtACCTG TCCCCCTTTC 

5ifil * TCeCTTCOGG AAOCOTOCCC CTKCXCAXT CCICACC CT G TACCTATCTC 
5201 JCKTTTOGG J^^gJoCOC CAAACAGTTA CCAOTOCOAC ATCCATA»0 

525i* *iffinCC0TGT AGCTCCTTCO CtCCAAOCTC CCCTOTOTCC ACQAACCCCC 
5301 ccrrcjtBccc cac c q c t cc c ccraATcccc JJActxtcgt cttcactc«

OCMOICGGC CTOQCGACGC CCJOXMOCC XTTQATAOCX GXACTCACCT 

sUl VcCCGOTJUlC ACACOACrrX TCCCCACTCC CXOCXO g XC TOOTAXCJ^ 
-CCGCCATTC TGTCCTGAM ACCGGTG^CC CTCOTCCCTO AOCXTTCTCC 

5401 XTTAOCAfiJtC CGAGCTATOT ACCCOGTOCT ACAGXCTTCT TGAAGTCCTO 
SmCCICTC CCTCCATACA TCCOCCACCX TGTCTCAAfiX ACTTCACaC 

S4S1 OCCTAACTAC OCCTACXCTA CAXOGACAW XTTTCCTATC TGCCCTCTOC 
CCGMTCATC CCGATGTGJCT C X TC C T C TCA 7MACCXTAQ ACCCGAGACC 

5S0r *TOMCCCAGT TACCTTCCCA AAAAGAOTTO GTACCTCTTG XTCCGGCAAA 
ACTTCOGTCA ATGCAACCC7 TTTTCTCAAC CATCGACAAC TAGGCCGTTT 

5551 CAJOOCACCC CTGGTACCCG TOCTTTOTT CCTMCA A^ j!£S£5T?5£ 
CTTTGGTOeC GACCATCOCC ACCAAAAAAA CAAACGTXCC TCGTCTAATG 

SfiOl OCGCA&AAAA AAACGATCTC AACAACATCC TTTCATCTTr TCTACGGGCT 
OGCCTCTTTT TTTCCTACAfl TTCTTCTACO AAACTAGAAA AGATCCCCCA 

SfiSl CTCACCCTCA CTGGAACGAA AACTCACGTT AACCGATTTr GCr?CAT»CA 
CACTGCSACT CACCTTGCTT TTGAGTOCAA TTCCCTAAAA CXACTACTCT 

5701 TfATCAAAAA GGATCTTCAC CTAGATCCTT ^ AAACT AAA W^^JAGTrT 
AATACTTTTT CCTAGAACTC GATCTACGAA AATTTAATTT TTACTTCAAA 

57S1 TAAACCAATC TAAACTATAT ATGAGTAAXC TTGGTCTOAC AGTTACCAAT 
ISSjOTAG ATX-ICATATA TACT^ 

SaOl GCTTAAKAC TGAGGCACCT ATCTCAGCCA TCTGTCTATT TCCTTCATCC 

«5l a-xcsttgcct cactccgsso gggccgggcg ctgaggtctg cctcctgaag 
w^caamS ctcagocccc ccccccccgc gactccagac cgagcacttc 

MOi JLArMTGTTGC TCACICATAC CAGGCCTGAA TCCCCCCATC ATCCACCCAG 
M51 AXXrr^GAGGG AGCCACCOTT CATGAGAGCT TTGTTCTAGG TCCACCAGTT 

cVor'oi^i^i^xii^'i^^^ nss^ «?2Sisg? ss^sjj 

CCACTAAAAC WGAAAACGA AACCCTGCCT TCCCAGACCC AACAGCCCTX 
«0M* OMCCOTGAT CXCATCCTTC AACTCAOCAA AAOTTCCATT TA3TCAACAA 

siVvvscecTC eCCTCAAGTC AGCG7AATGC TCTGCCAOTG TTACAACCAA 

Tcoaii^^ 

/lir iTAAcakkra ajuictcxtcg accxtcaaat gmjjctgc^ 

AmSgGOTUI OACTAMCTI TT«SJUBl»eC tCQT&GTCTX CTTTOACOTT 
6251 CTAXTGAXfiO XCJUOACTCA CCOMWCXCT TCCAT ACgy ? GCCAXGXTCC
CATTACTTCC TCTTPTCAff? GGCTCCCTCA ACCTATCCIA CCGTTC7ACC 



6301 TCGTXrCCGT CTCCOMTCC QU Z TC O K C li ACATCXATAC XXCCTATTAA 
ACOVITAOCCX CACGCTAAOG CTCACCXGC7 TCTXCTTATG TTGGATAXTT 



6351 TTTCOCCTCC TCAAAAXTAA GGTTXTCAAC TGXGAAATCA CXATGACTGA 
AXAflGCCXCC AGTTTTTMT CCAMACTIC ACTCTTTAGT GCTACTCACT 



HindIII

6401 CGACTGAXTC COGTGACAAT 6CCAXAACCT TATGCATTTC TTTCCACACT 
CCTCJICTTAC CCCACTCTTA CCOITTrCGX ATACGTAXAO AAAGCTCTGA 



6451 TGWCAACAC GCCAGCCATT ACCCTCGTCX TCAAAATCXC TCOCATCAXC 
ACJIAGTTCTC CCSTCGGCAA TGCGXCCACT AGTTTTAGTG ACCCTAGTTC 



PvuI

6501 CAAACCCTTA TTCATTCCTG ATTGCCCCTG AGCCfcCACGA AA7ACGCGAT 
CTTTCGCAAT AAGTAAGCAC TAACCCOGAC TCCCTCTGCT TTATGCCCTA 



PvuI

6551 CCCTGTTAXA AGGACAATTA CAAACAGGAA TCGAATGCAA CCGGCCCAGG 
CCGACAATrr TCCTCTTAAT CTTTCTCCTT ACCTTACGTT CCCCGCGTCC 



6601 AACACTCCCA GCCCATCAAC AATATTrPCA CCTCAATCAC CATAl'iUlXC 
TTGTCACGGT CCCGTACT7G CTATAAAAGT CCACTTAGTC CTATAAGAAO 



6651 7AATACCTGG AATGCTCTTT TCCCCGGGAT CGCAGTGCTC AGTAACCAT6 
ATTATGGACC TTACGACAAA ACGOCCCCTA GCGTCACCAC TCATCGGTAC 



6701 CATCATCAGG AGTACGGATA AAATCCTTGA TCGTCGCAAG AGGCATAAAT 
GTACTAGTCC TCATGCCTAT TTTACGAACT ACCAGCCTTC CCCCTATTTA 



6751 TCCCTCAGCC AGTTTACTCT GACCATCTCA 7CTGTAXCAT CATTCG CAAC 
ACGCAGTCCG TCAAATCACA CTGGTACAGT ACACATTCTA CTAACCCT7G 



6801 CCTACCTTTG CCATGTOTCA GAAACAACCC TGGCGCATM GGCTTKCAT 
CGXTGGAAAC GGTACAAACT CTTTGTTCXG ACCCCGTAGC CCGAAGCCTA 



6851 ACAATCGXTA 6ATTCTC0CA CC7GATTCM CGAOJ^^ S^^^SI 

tctxacctat ctaacaccct ggactaacgg cctgtaatag cgctccggta 



XhoI



6901 T-ATACCCAT ATAAATCAGC MCCATCTTO gJJWTAATC CCGGOCTMA 
AATATOGGTA TATTTAGTCO TACOTACAAC CTTAAATTAO CGC C CGACCT 



XhoI



6951 GCAACACGTT TCCCOTTCAA «TCCCTCAT Aj^ g XC CCC CT «^"A«CT 
CCrrCTGCAA AOGCCAACTT ATACCOAGXA WGTO0C«X CA7AATCACA 



7001 TTAIGTAJUX: AGACAirPTO ATTGTOA 

AATACXTTCG TCTCTCAAAA TAACAACTAC TACTATATAA AAATAGAACX 
pnZZZ



70S1 CCJJireTAAC ATCAOKEaTT TTGMACACX ACCTOOCTTT CCCCCCCCCC 
CCTXACATrC TAOTCTCTJLX JOCTCTCTOT TOCACCOAAX CCGGGGGOGO 



7101 CCATTATTQA ACCXTTTATC ACCCtrrATTO rCTCXTGAGC GGXTACXTAT 
CaCMSMCt TCGTAAATAC TCCCAATAAC JLCACTACTCC CCTATCTATA 



7151 TTGAATOTXT KAGAAAAAT AAACAAATAC CGGTTCCCCC CACATTTCCC 
AACTTACATA AATCTTTTTA TTTOTTTATC CCCAAfiOCGC CTCTAAAOfiW 



7a01 CCAAAASTGC CAOCTOACGT CTAAOAAACC ATTATTA3CA TGACATTAAC 

ccrrracAco ctgoactgca cArrcTTWC iaataatact actgtaattc 



7251 CTA7AAAAAT ACGCCTATCA CGACGCCCTT TCOSC 
GAXATTTTTA TCCGCATACT CCTCCGGGAA ACCAO 
pVR 1012-sGP(Z)

General Description

Name: pVR 1012-sGP(Z)
Local object

Created: 09/14/98 04i39nc

Last Modified: 09/15/98 04:50 PM

Length: 7273 bp

Storage type: Basic

Form: Circular



Restriction Map

HindIII: 1 site
HpaI: 1 site
KpnI: 1 site

PvuI: 1 site

XbaI: 1 site
XhoI: 1 site
EcoRV: 2 sites

NdeI: 2 sites
SphI: 2 sites

Functional Map

CDS (4 signals)

CMV IE 5' UT

Start: 889 End: 1139
CMV IE INT

Start: 1130 End: 1840
BGH

Start: 4289 End: 4841
Kanr

Start: 6337 End: 695S (Complementary)
Misc. feature (2 signals)
CMV enhancer

Start: 348 End: 885
sGP(Z)

Start: 1870 End: 4288

Annotations 
1 TCOeCCOTTT CCSTGMCAC CCTCJJULACC TCTGACACXT GCAOCTCCCO
JLCCGCCCJUU^. OOCACTACTO CCACTTTPGC AOACTCTGTA CCTCOAOCGC 



51 CACXCGG7CX CA flCT TOlC' OTAACCGGAT C CC O C OCrA CACAXQCC CC 
CTCTCCCAfiT GTCGMCAfiA CXTTCCCCTJl COGCCCTCCT CTGTTCGCCC 



101 tCAGCOCGCC TCAGCCaetG TtCGCGGGTC TCGCCOCTGG CTTAACTATC 
ACTCCCCCCC ACTCQCCCAC XACCCCCCAC ACCCCCGACC GAATTCXTAC 



151 CCGCATCACX GCACATTGTA CTG3U3ACTCC ACCATXTGCG CTGTGAAXTA 
CCCCTACTCT CCTCTAACAff OACTCTCXCG TCCTATACCC CACACTTTAT 



201 CCOCACACAT GCCTAAGCAC AAAATACCGC ATCAQATTOC CTAWCCCCA 
GGCCTGTCTA CGCATTCCTC TTTTATGGCC TAGTCTAXCC GATAACCGCT 



251 TTGCATACGT TGTATCCATA TCAXAA2AT0 TACATTTATA TrGGCTCATQ 
A^iCCTATGCA ACATACGTAT ACTATTATAC ATGTAXATA7 A^CCCAGTAC 



301 TC CAACATTX CCGCCATGTT GACATIGATT ATTGACXXCT TATTAATACT 
ACCTTGTXAT GGCCCTACAA CTGTAACTXA TAACTCATGA ATAATTATCA 



351 AATCAATTAC CCCCTCATTA GTTCXTAGCC CATATATCGA CTTCCCCCTT 
77AC7TAATC CCCCACTAAT CAAMATCGC CTATATACCT CAAGCCGCAA 



4.01 ACATAACTTA CCGCaAATCG CCCCCCTGGC TCACCGCCCA ACGACCCCCG 
TCTATTOAAT GCCATTTACC CCGCGGACCC ACTGGCGCGT rrCCTCGGGGC 



iSl CCCATTGACC TCAATAkTGA CGTATGTTCC CATAG7AACC CCAATACGGA 
GGCTAACTCC AGTTATTACT CCATACAACO GTATCATTGC CCTTATCCC? 



is 01 CTTTCCATXC ACGTCAATGG CTGGAGrATT TACGGTAAAC TGCCCACTTC 
CAAAGCTAAC TCCA6TTACC CACCTCATAA ATCCCATTrC ACGGGTGAAC 



NdeI



551 GCACTACATC AAGTCTATCA TATGCCAACT ACCCCCCCTA TTGACGTCAA 
CGTCATCTAG TTCACATAC? ATACGCTTCA TCCGGGGGXT AACTOCAGTT 



COl TCACGGTAAA TGCCCCGCCT GGCATTATOC CCAGTACATC ACCTTATGGG 
ACTGCCATTT ACCGGGCCGA CCCTAATACG GGTCAXCTAC TCGAATACCC 



NcoI

€51 AcrrrccTAC woccwtac atctacotat tagtcxtcgc tattaccatc 

TCAAACGATO AACCGTCATC TAGATCCATA ATCAOTAGCC ATAATGCTAC 



NcoI

701 roGAtCCOGT TPTOOCACtX CA^CJA^ SSI^IJS 

CACZAOGOCX AAACCCTCAT CT&GTZAOOC GCACCTATCG CCAAACTQftG 



751 ACCCGGATrP CCAACTCTCC ACCCCATTCA CCTCAATCGG ACTTXCTTTT 
MCCCCTAAA CGCTCauaGC TGOGCTMCT GCACTTACCC 9CAAACAAAA 



eOl GGCACCAAAA TCAACOGCAC CTIC CXAAA T CTCCT AACAX CTCCGCCCCA 
CCCTGOTTTX AGTTGCCCTG AXXG6777TX CAOCATTCTT GAGCCGGGOT 
851 TTCACCCAAA TCGGCCCTAO CCCTGTACCC TGGGAGCTCT ATATXXCCM
JUCTGCCTTT ACCCOCCAtC CGCACATCCC ACCCTCCAGA TATArrCCTC 

901 ACCTCCTTTA CTCAACCCTC AOATCCCCTO GACA C C CC AT CCACGCTCTT 
TCGACCAAAT CACTTGCCAC TCTACCGGAC CTCTOCGCTA GOTGCCACAA 

SacII



151 TTGACCTCCA TAGAAOACAC CGGGACCGAT CCACCCTCCC CGGCCGGGAA 
AACTCGAGCT ATCTTCTGTG GCCCTGGCTA CG7CGGACGC GCCCSCCCTT 



aOOl CGGTGCA7IO CAACCCCCAT 9CCCCGTGCC AAGAGTGACG 7AACTACCGC 
CCCACCTAAC CTTCCCCCTA ACGOGCACCG TTCTCACTGC AnCATGGCC 



SphI

1051 CTATAGACTC TA7ACCCACA CCCCTTTCGC TCTTATCCAT GCTATACTGT 
CATATCTCAS ATATCCGTGT CCGGAAACCG AGAATACGTA CGATATGACA 



1101 TrrrCGCTTC CGGCCTATAC ACCCCCGCTT ccttatccta tagctcatoc 
AAAAjCCGAAC CCCGGATATG TCCGGGCCAA GGAArACGAT ATCCACTACC 



1151 TATAGCTTAC CCTATACGTG 7CCGTTATTC ACCATTAtTG ACCACTCCeS 
ATATCGAATC GGATATCCAC ACCCAATAAC TGCTAATAAC MGTGAGOGG 

1201 TATTGCTCAC GATACTTTCC ATTACTAATC CATAACATGG CTCTTrCCCA 
ATAACCACTG CTATGAAAGG TAATGATTAC CTATTG?ACC GAGAAACGGT 



1251 CAACTATCTC TATTGGCTAT ATGCCAATAC TCTGTCCTfC AGAGACTGAC 
GrrCATACAG ATAACCCATA TACCG77ATC AGACAGGAAG TCrCTGACTC 



1561 ACGGACTCTG TATTTTTACA GGATCCCGTC CCATTTATTA TTTACAAATT 
^GCCTGAGAC A7AAAAA.TGT CCvACCCCAG CGT.VLATAAT AAATGTTTXX 



1351 CACATATACA ACAACCCCGT CCCCCGTGCC CGCAGTTTTT ATTAAACATA 
CTCTATATCT TGTTCCGGCA CGGGGCACCG GCGTCAAAAA TAATTTCTAT 



1401 CCGTGGGATC TCCACGCGAA TCTCGGCTAC GTCTTCCGGA CATGGSCTCT 
CGCACCCTAG AGOTGCGCTT AGACCCCATG CACAAGCCCT GTACCCGAGA 



1451 5CTCCGCTAG CCCCGGAGCT TCCACATCCG ACCCCTGGTC CCATGCCTCC 
AGAGCCCATC CCCGCCTCGA AGCTCTA60C TCGGGACCAG GGTACGGAGG 



1501 AGCGCCTCAT COTCGCTCOO CAGCTCC77G CTCCT AACAQ TGGAGGCCAO 
TCCCOQAOTA CCASCCAOOC GTCGAGGAAC GAOCA T TQIC ACCTOCGGVC 



1551 ACTTACCCAC ACCACAATOC CCACCACCAC CAOMTOCCC CACAXCCQOG 
TCAATCCCTC TCCTCTTACG CCTCGTGGTC GTCACACOGC 6TCTTCCGCC 



IfiOl TCOCGGTACG GTATCTGTCT GAAAATCAGC GTCCAGATrC GQCTCGCACO 

aScgocatcc catacacaca cttttactcc cacctctaac ccGACcoroc 



1551 GCTCACGCAC AJrCGAATSACT TAACGCAGCC G CAGAAGAAG ATCCACGCAC 
MACtScTC TACCTTCOTA AMCCCTCCC CGTCTTCrrC TACGTCCCTC 



1701 CTGAGTTCrr GTATTCTGAT AAQAGTCACA OGTAACTCCC CTTCCGCTCC 
GAC7CAACAA CATAAGACTA TOCTCACTCT CCATTGAGCG CAACGCCACO 
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«751 TCOTWkCCCT GGACCGCAOT CTAGTCTGWS CACTXCTCCT TCCTGOCCO 

aSwxcccx cctcccgtcx cmpcacxctc ctcxtgacca accaccqccc 

" "** Ncol 

1801 eaCCSaiCCJl 6A»TAA35X6 CreACASACT AACXCACTST TCCTTTCCXT 
S^CTCOT CCCTATTATC GACTGTCKSA WCTCXGACA AGGAAWWtA 

Vk:^ Bait EcoR^-I 

1851 OeCICTTTXC ICCACTCACC CtCOTCOkCJl CGTCTOJiaCX CATXTCCCCa 
SoWUaO ACCICAGWC CASCACCTGT CCACACTACI CTATACCCCC 

1901 eeecrcTASX ccacgcgcct ccxtcgjatt gatgaxcatt aacccgac^ 

SSStCT bCT^GGA CCTAGCCTAA CTACT^^ 

^Mi ' TclccerAX*^ CTTCXTCtCT CTTAGATTAT TTGTTTTCCA QACTAGGGCT 

S^^; S3[^A0A CAATCTAAIA AACAAAAGGT CTCATCCCCA ^ 

ifici * 'esTCACGTCC TTrrCXAlCC TGTAACCAAA ATAAACTOCA CTAGAASCAT 
S3[^??S5g SSStTACC ACATTCGTTT 7ATTTGAG5T f^^^j^^^ 

5*051 * xnGTGOSX AACAACACAA TGOGCCrTAC AGOAAHATTO CA0TTACC7C 
llicACcSG TTOTCGTCTT ACCCGCAATG TCCTTATAAC CTCAATCCAO 

Moi * CTGATC6ATT CAACAGGACA TCATTCTICC TTTGCGTAAT TATCCTTTTC 
Skaktaa cSctcctct AGTAACAAAG AAACCCATIA ATAGGAAAAG 

SS g^A ISg^aSSc toaaccccac taggtckat cwctaatot^ 

J2oi * *GGTTACTGAT CTCGACAAAC TACTTTGTC6 TGACAAACTO TCATCCACAA 

cSatcactI cagctctttc atcaaacacc actctttgac actaggtctt ^ 

•jUi * *-*-cIvrTGAC ATCACTMGA CTGAATCTCC AAOGCAASGG ACTGGCAACT 

JISSaS^ gacttagagc "CccJJ^^ . 

2*aor*CMKicCCAT CTGCAACTAA AAGATGGGOC "CJ^GT^O STSIS^SiS 
CmScGGTA GACGTTCATT rrCTACCCCfl aactccaggc cacaggotco 

an^ joAOOtoIwc AK^ CTcareAXTo cgct caaaac tcctacaatc 

tttcSmab ttaatacttc oaccacttac J^^ff^^^J^^^ 

SSJSSt JttwSSwc cccs^^ 
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aSOl CJUCGTCTCC Vi tAJa JiTCT QATACTCCCC ^NACCTJAGX ^SGJ^JJCTT 
CnCCACIUGC AACGTJUaa^ CTATOACCCG CrrCCMTCT 

a«r *ckGC»lACAC CCCTTCXGXO ACCCOGTCAX T^J^CCGAC ^JCMCTCTj^ 
CTCCACTCTC CGGAACTCTC TCGGCCAflTT XCOTTGCCTC CTCCCCAOAT 

EcoRV 

2701 CTOGCTXCTA TTCTACCACA ATTXGXTXtC AGGCTXCCGG TmO»Xtt 
CACCC31TGAT AAflATCGTGT ^XXTCTATAG TCCCXTOGCC AAAACCTTOO 

2751* JOMUaUM ACTOCHCTT CCWIOCTOAC ^ATTTCJLCCT JCCTCOJ^ 
TTACTCTCTC TCATCAACAX CCTCCAXCTO TTAXACTCGA TGCACCTTGA 

2801 TGAXTCAAGA TTCACACCJ^C XCTTTCTCCT CCXCCTOAXT CACACAXTXT 
iS»fiTTCT SctGTCCTG TCAAACACCX CCJTCCfcOTA CTCTG7TXTA 

/asi* ATXCAACTGG GAXXXGGACC AATACCACGC CAXAAMXJIT TIC GAAG STC 
^ ATCrrCACC CTTTTCCTCG TTATGG7CCC CTXTTGASTA AACCTTCCAO 

2901 AXCCCCCAAA CTGATAC AAC AATCCGGGAC TGGGCCTTCT ggC AAAC TAA 
■ TTGGMCTTT AACTATCTTG «ASCCCCTC ACCCCGAAGA CCCrrrGArT 

2951 AAAAACCTCA CTAGAAAAAT TCCCAGTGAA GAGrPCTCTT TCACAGllCT 
tttwgSIgt GATCTTTTTA AGCGTCACTT CTCAACAGAA AGTGTCAACA 

300i"aTCAAACGCA GCCAXAAACA TCAGTCCTCA GACTCCGGCC CGAACrrCTT 
TACTTXCCCT CCCTTTTTCT AGTCACCAGT CTCAGGCCCC GCTIGAAGJiA 

305^ VcCACCCACG GACCAXCACA ACAACTCAAG AOCACAAAAT CATCCCTTCA 
CGCTCCGTCC CTGGTTGTGT TGTTGACTTC CGGTCCTTTA CTACCGAACT 

3101 GAAAATTCCT CTCCAATGGT TCAAGTCCAC ACTCAAGGAA GG GAAC CTGC 
CTTTTiScGA GACCTTACCA AGTTCACGTG TCAGTrCCTO CCCTTCGACC 

aiSi* AGTGTCCCAT CTAACAACCC TXCCCACAAT CTCCACGAGT CCCCAATCCC 
TmSgCGTA GATTCr^GGG AACGGTGTTA GACCTCCTCA GGGGTTAGOO 

^201 <"CACAACCAA ACCAGGICCG CACAACAGCA CCCATAATAC ACCCCTOTAT 
^ S$^CCTT TGCTCO^ CTGTtCTCGT CCGTATTATG TGGCCACATA 

3251 AAAC7TGACA TCTCTCACGC AACTCAACTT CAACAACATC ACMCAGAAC 

aVor ^Juaia^^^ agcxcaocct com«ratt "cr^jco J^^j^^cc 

' r U AV l TGC T O TCGTCTCCGA OCCTOTGACC GAGACCGTOC TCGCG7CCGC 
CACCCCCAAA AGCAGAGAAC ACCAACACO A CCAACABCAC TCXCMCCTC 

3401 * GACXiaaa^ ^«c c»aaac cacxgcctca o^m^ 

CTGOCGCGGT CCTGTXCTTC ACCCGfTTTTC GTGTCGCTCT CGCaACCGTT 

usioj^^ JSSSiSf^ j^^jtSSSSt 

CrrCTTGTGA CTACTCGPTC TATGOCCTCT TCTCTCACCC TCCTCCCOCT 
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3501 *CCT*BCCTT MTCXCCMS ACTATXOCTC CACTTCCJCO ACTCJ.TCACJ; 

tceiOCOSXA TTAJWCCnai tcm**cc»c ctcaocotcc tcactactct 

a'ssi cGcasGwaa gaxctcoaxc mmccaatt ctcaxtcctc *ac^*ajj|IC 
C CT A^: i " : - '''^' ctTQAOcnc TcrccGn**. caottxcoac ttggwittac 

3*60^ CMCCCTJOT TTACXTTACT OQACtXCTCX CCXTCJOlC^ CCTCCJMlTM 
CTiaWMlA JIXTGTAATGA CCTCMPGJWCT CCTACrTCClL CCXCGTTACC 

3651 BUCTCCOCTC C&TACaiWIT TCCeeCCCAC cagcccaogc aatttacxta 
gaggSc^r SncrCOTATA aaoccccgtc ctcoociccc tcaxatgtai^ 

3701 GMBCBQCTAA MCACAATCA AGATGGTTTA ATCSOTGCCT T OACACAaC T 
ScOCOCAtr ACCTOTTACt ICWCCAAAT TAflACACCCA ACtCTGTCGA 

3*75r *ecccaukCCAfl accactcaao cTcrrcAACT cttcctcaoa gcmcaactc 

C Cffi r-l ^s f ^ tCCTGACTK GACAACTTCA CAASCACTCT CCOTGTTCAC 

3801 AeciAcecAC cttttcaatc ctcaaccgta acgcaattca tttck«ctg 
tcgSmgtc caaaactxag gaottoocat tcccttaact aaagaaccac 

3851 CACeCATGGC CCGGCACAIG CCACATTCTC GGACOGSACT eCTGXAtCOA 
StCCcScCC CCCCCTCTAC CCTGTAACAC CCTCOCCTGA CGACATAfiCT 

3901 ACCACATQAT TOGACCAAGA ACATAACAGA CAAAAT?GAT CAGATTAT5C 
TGSIcSciA ACCTCCTTCT TCTATT6TCT GTmAACTA CTCTAATAAQ 

3951* ATOJarrroT tcataaaacc cttccggacc asoggcacaa tcacaattcc 

TACmSaACA ACTAnTPGG CAAGGCCTCG TCCCCCTCTT ACTGTTAACC 



zVol' 'v^eijaJSaKS CCAfiACAATC GATACCGGCA GGTATTCGAG T7ACAGCCCT 
' " ACCIGICCTA CCTCTCTTAC C-IATCCCCST CCATAACanC xxtCTCCSCA 

/oS^ TMSAIOTCCA GrtATCGCTT TATTCTOTAT ^»GCAJi^ ^I^TTIIi?! 
ATA3TAAC6T CAATACCCAA ATAAGXCATA TACSTTTAAA CAGAAAATCA 

/lOr *«TOttCAG ATTGCTICAT GGAAAAOCTC JSCCTOkAAT WATGAAA^ 
AAAASNU?tC TAACCAAGTA CCTTTTCCAO TCCCAGTTTA OtTACTTrGG 

«Ui *Al;^^i*7ATki^TT 

TCCXAAATTA ATATACCTAA TOAACTTAOA TTCTAATCAA CTGTTTACXA 

4201 JLSTXTAXTAC AKGGAGCOT TAAACATAOC CAATCtGATX CTAACTCCTT 
mSsSS tScSSwI AMMTATCO CtTA^ 

i^Sl TAAACXCACA OWAATCATA AACAACOXTT CGAATTGAXC TCCTGTOCCT 

^neSerar Satcastas ncTTCcAAA ccrtAACTAC AasACAOcoA 
iiti' vrainfixh^^ tctttocccc Tooccca«se ctoccttcac 

AGMOAKO T^eXABACA ACAAACeCGO ACCGCGaCC CAAGGAACSO 
J«i CX3CGAAGGT CCCACTOOCA C^UII-CAXIC CTAATAAAAT CAOCAAATO O 

^SStAjS AOACtOaCC ACA^^ 
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44S1 CACCAUOai AGGCCOXQO^ TtQQQhMXC AATAGCAOOC ^TGCTQGGGX 
GfXCGZCTCQT TCCOQCTOCT AACCCTTCTG TTATCGTCCC TACGACCCCT 



4501 TQCCSrCCGC TCTXTGGGTX COCAGGTGCT CXUGAXTXCA CCCGGITCCT 
ACGCCACCCG ACXTACCCAT GGCTCCACGA CTTCTTAACT GGCCCAAGM 



<55X CCreGCCCAG AXXOAAGCAC CCACATCCCC WCTCTCTW CACACCCTOT 
CGACCCGOTC TnCTICCTC JUGAGAOICT GTCTGGGACX 



4601 C C A C O O CC CT CGTTCTTAOT TCCAOCCCCX CTCATXCGAC ACTCATAOCT 
CCTO C CCCGA CCAAGAATCA AGGTCCCCGT GACTA3CCTC TOAGtrATCGA 



4651 CAGGAfiGGCT CCGCCTTCAA 7CCCACCCGC TAAAOTACTT OQAGCCOTCT 
C7CCTDCCGA GOCGGAAGTT AGGOTGQGCG ATTXCATQAA CCVCGCCAGA 



4701 CTCCCTCCCt CATCACCCCA CCAAACCAAA CCTAGCCTCC AAOAC7»3GGA 
GAS9CAGGGA CTACTCCOGT CCTTTGGrrT GGATCGOAGG TTCTCACCCT 



4751 AGAAATTAAA CCAAGATAGG CTATTAAGTO CAGAGCGAGA GAAAATGCCT 

TCrmATrr cgttctatcc cataattcac gtctccctct cttttacgqa 



4801 CCAACATGTC AGGAACTAAT CA6AGAAATC ATACAATTTC TTCCGCTTCC 
CGTTGTACAC TCCTTCArtA CT C 1 C 7X TA G tTATCTTAAAG AACCCGAAGC 



4851 TCCCTCACTG ACTCGCTGCC CTCCCTCCSTT CGGCTGCCGC GACCCGTATC 
ACCGACTGAC TGAGCGACGC GAGCCAGCAA GCCGACGCCG CTCGCCATAG 



4901 AGCTCACTCA AAGC^CGGTAX TACC-^TTATC CACAGAA7CA GCCGATAACC 
TCGA5TCAGT TTCCGCCATT ATCCCAATAG GTGTCTTAGT CCCcTAl=rGC 



4951 CAG GAAAGA A CATGTGAGCA AAAGGCCAGC AAAA CCCCAG CAACCGTAAA 
G5CCTTTCTT GTACACTCG7 7?TCCGGTCG TTTTCCGCTC C7TGCCA7TT 



5001 AAGGCCGCG7 TGCTGGCGTT T77CCATACG CTCCGCCCCC C7CACGAGCA 
TTCCGCCGCA ACGACCCCAA AAAGG7ATCC GAGGCGGGGG GACTGCTCOT 



5051 TCACAAAAAT CQACGCTCAA CTCA6AGCTC GCGAAACCCG ACAGGACTAT 
AGTCTTTTTA GCTGCOAGTr CA G 7 C TCCAC COCn^GCGC TCtCCTGATA 



5101 AAAfiATACCA GGCCTTTCCC CCTGGAAGCT CCC T C CTCCG CTC7CCTGTT 
mCTATGOt CCCCAAAftCG GCAO CT TCCA QQGACCACGC GA0A6GACAA 



5151 CC CA CCCT CC COCnACCGG ATACCTGTCC OeOTTCtCC CTTCGGGAA6 
CGCTGCaACO CCGAATCGCC TATCOACAGG CQGAAAGAGG GAAGCCCTTC 



5201 CGTGGCGCTT TCTCAATGCT CACGCTCTAO GTATCTCACT TC0Q7CTAG0 
CCAKGCGAA AGAGTTACGA 6TGCGACATC CAXAGACTCA AGCCACA7CC 



5251 TCGTTCGCTC CAAGCTGGQC TG7QTOCACG AACCCCCCGT TCAQCCCCAC 
AOCAAGCQAG GTKGACCCG ACACACCWC TTGCGGOGCX AGTCGGGCTO 



5301 CCC7GCGCCT TATCCGGTAA CTATCOICTT GAOTCCAACC CGGTAAGACA 
OCGACCCGGA XTAOGCCATT GATACCAGAA CTCAGGTTGG GCCATTCtC? 
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5351 CCACTTATOO CCACWGCAfi CAGCCIUCTOO TyCAQGXTT 9^0KGCGX 
GCTOAXUGC GGTOACCCTC GTCGCTQACC XT5GTCCXXA TCCTCTCCCT 

5401 CCtaSCTAfiO CCCTQCT A CJL CAOTTCTTGX ACtCOTCCCC TXACTA C CQ C 
CcSSxTCC CCClUWaWtOT CZCJUUUACT TCACCACCCO ATTCXMCCO 

5451 TACRCtMJU^ CGACAGTATT TOG»TCTOC CCTCTCCTCX AOCCACTTAC 
ATCXCJltCre CCTG2CAIAA ACCWXCACC COAflACGACT TCCCTCAATC 

5501 CTXCQGAAXJl ACACTTGGTA CCTCTT C ATC CGC CJLAXCAA ACCACCGCTO 
f^y ^j^ 'i w iwp T C T C AACCAT CGACXACTJUS CCCGTTTCTT TGGTGCCQAC 

5551 CTMCCCTOa I ' ilArilGXT TGCAXCCW3C ACATTACCCC CAGAXXXAAA 
CAKGCCACC AXAAAAACXX ACCTTCGTCC 2CTAATCCGC CTCTTTTTTT 

5601 OGKICTCAXC AJUSMCCTTT GATCTTTTCT ACCOGGTCTO ACGCTCAGTC 

CCMGJU;TrC tictaooaxjl ctacwu^aga tcccccacac tocgagtcac 

5651 GAACGMUaC TCACOTTAXO OGATrTTGGT CATGACXTTA T CAAAAXG CX 
C ' ^OLiiA ' AO XOTGCAXTTC CCTAAAACCA CCACTCTAAT AGTTTTTCCT 

5701 TCTTCACCTA CATCCTTTTA AATTAAAAAT GAACTTTTAA XTCAATCTAA 
ACAACTGGA? GTAGCAAAAT C?AAr i TT7A CTTCAAXATT TAGTTAGATT 

5751 ACTATATATC ACTAAACTTG CCCTGACAGT TACCAXTGCT '^^TCAGTGA 
TCATATATAC TCATTTGAAC CAGACTCTCA ATCGTTACCA ATTAGTCAC? 

5801 CCCAOCTATC TCAGCGATCT GTCTATTTCG TTCATCCATA GTTTCCTGAC 
^ ScGATAC AGTCCCTAGA CAGATAAAGC AACTAGCTAT CAACCG^ 

5851 tcccccgggc gogcccgctg aggtctgcct cgtgaacaag ctcttcctca 
jSsccccccc cccccccgac tccagacgca ocacttcttc cacaacgact 

5901 CrCATACCAG CCC7GAATCG CCCCATCATC CACC CAGAAA ^^^^^^^SSX 
GACISATGCTC CGCACTTACC GGGGTAGTAG GTCGCTCTTT CACTCCCTCG 

5951 CACCG7TGAT GAGACCTTIC TTCTAOGTGC ACCACTTGG7 CA77TTCAAC 
CrraACTA CTCTMAXAC AACATCCACC TGCTCAACCA CTAAAACTTC 

cooi' TmccrrTQ ccacgqaacg crcrGccrPG tccggaj^ SESlSilSS 

WUUUXAAAC GGTCCCTTGC CAGACCCAAC ACCCXTTC^ 

6051 ATCCrrCAAC TCACCAAAAG TTCCATTTAT TCAACAAAOC CGCCGTCCCG 
ciSoTTG AOTCOTTTTC AACCTAAATA ACTTOTTTCC GCCCCAGGGC 

6101 TCAACTCAGC GTAATCCTCT CCCACXCTTA CAAOCAASTA ACCAATTCtC 

6151 ATTACAAAAA CTCATCGACC ATCAAATGAA ACTGCAATTT ATTCAIATCA 
IISctTOT SStaOCTCG TACTTTACW 

6201 CCARAICAA TACCATATTT TICAAAAAOC COTTTCrGTA AJ CAAGCAQA 
SImOT SgcStAAA AAC^^ 

AA&CTCACCG AGCCAflnCC ATAGCAMCC AAGATCCTOO XATCOGlCTa 






WO 99/37331 




PCT/US99/01382 



6301 CGKnOCGAC TCCTCCAACA CTA7TAATTT CCC C T CC T C X 

GCXAAGGCTG ACCAGGTXCT A8TTXTCTT0 GATAXTTAAA CCCGXGCAGT 

6351 JUUUCSAAGOT TATCAACTGX OAAXTCACCX TOACTCXCOX CTC&A.TCCOC 
TTPraXTCCA ATACTTCACT CTTTAOTGCT ACTCACTCCT GACTTACGCC 

HindIII 

6401 TGAGAATOOC AAAAGCTTAT OCATTXCTTT CCACACTTGT TCAACACCCC 
ACTCTTACCC TTTTCCAAIA CGTAAAOAAA CCTCTGAACA AGCTGTCCCG 

6451 AOCXASTACG CTCOTCATCA AAATCACTCG CATCAA C CAA ACCCTTATTC 
TCCCTIIAXCC GACCAGTAGT TTTAGTGAGC CTAOTTCCrT TGGCAATAAO 

PvuI 

6501 ATTCGTCXTT 0C6CCTGAGC CACACQAAAT ACGCGATCCC TGTTAAAAGO 
TAAOCACTAA CCCCGACTCC CTCTOCTTTA TOCCCTACCG ACAATTTTCC 

6551 ACAATTACAA ACACGAATCO AATCCAACCC GCGCACGAAC ACTCCCAGCG 
TGT7AAT0TT TCTCCTTACC TTACCTTCGC CCCCTCCTTG TGACGGTCCC 

6601 CATCAACAAT ATTTTCACCT CAAT CACCAT ACTC TTCTAA TACCTGGAAT 
GTACrrCTTA 7AAAAGTGGA C77AGTCCTA TAACAACaTT ATGGACCOTA 

6651* GCTCTTTXC CCGGGATCCC AGTCCTGAGT AAOOJlTC^ EJ^SSSIST 
CGACAAAACC GCCCCTAGCG TCACCACTCA WGGTACGTA CTACTCCTCA 

6701 ACCSAIAAAA TGCTTCATOG TCGGAAGACC CATAAATTOC GTCAGCCAGT 
TCCmATTTT ACCAACTACC AGCCTTCTCC GTArTTAAGG CAGTCGCTCA 

6751 TTACTCTGAC CATCTCATCT CTAACATCAT TGGCAACGCT ACCTTTGCCA 
StCACACTG GTACAGTAGA CATTCTACTA ACCCTTGCGA TGGAAACCG7 

6801 TGratCACAA ACAACTCTGC CGCA7CCCGC TTCCCATACA ^^^GATAGAT 
ACAAACTCTT TCTTGAGACC GCCTAGCCCG AAGGGTATCT TACCTATCTA 

6851 TGTCSCACCT GATTCCCCGA CATTATCGCG AGCC«OTTA TACCCATAT^ 
ACASCGTGOA CTAACGGGCT GTAATACCGC TCCGGTAAAT ATGCCTATAT 

XhoI 

6901 AATCAGCATC CATCTTGGAA TTTAATCGCG GCCTCmCCA JGJCCTT?^ 
TTAGTCCTAG CTACAACCTT AAATTAGCGC CGGACCTCOT TCTGCAAACC 

6951 crSTTCXAIAT CGCTCAffAAC ACCCCTTOTA TOACTOTOTA TGTAAfiCAOA 
cSScttSa CcSSStTC TOOOOAACA2P AATCACAAAT ACATTCCTCT 

7001 * ttGTTTTATx'cPr^ ATATATTTTT XTCTTCTOCIl J^«Jj«;ajTO 

GTCAAAATAA CAACTACTAC TATATAAAAA TAGAACACGT XakCATTGTAC 



DraIII 



7051 AGACATTITG AGACACAACG TCOCmCCC CCCCCCCCCA TTATT G^ACC 
TCTCZ21AAAC TCTCTGTTCC AOCGAAACCO CGCCCOGGGX AATAACTTCQ 



7101 ATTXATCACG GTTATTOTCT O^TCO^^ ?2SiIiTTIS 

TAAATACTCC CAATAACACA GTACTCCCCT ATCTATAMC TTACATAAAT 
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71SI CJUJUmTAAX CJOXTAOOOO TTCCGCOCAC XTTTCCOCGX AAACTCCCAC 
CnTTTATTT OmATCCCC 3UUIQ CGC CTC 9JUU10GGGCT TrTCACOGTO 



7201 CTOA CG T C T X AOAMCCXTT ATTATCAMX CXRAXCCTA TXAAAXTASO 
6ACTGCJUSAT TCTTTGOTAA TAATJkGTACT GTAATTCCXT A.T777TATCC 



7251 CCTATCACGA CC CCCTI T CG TC 
OC&TAGTGCT CCQCdUUCC AO 
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